Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1uHNKY-001622-3o for pgsql-admin@arkaria.postgresql.org; Tue, 20 May 2025 13:48:30 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.94.2) (envelope-from ) id 1uHNKW-006oYP-9V for pgsql-admin@arkaria.postgresql.org; Tue, 20 May 2025 13:48:28 +0000 Received: from magus.postgresql.org ([2a02:c0:301:0:ffff::29]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1uHNKV-006oYH-Qk for pgsql-admin@lists.postgresql.org; Tue, 20 May 2025 13:48:27 +0000 Received: from mail-yb1-xb2e.google.com ([2607:f8b0:4864:20::b2e]) by magus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.96) (envelope-from ) id 1uHNKS-002xNJ-3D for pgsql-admin@lists.postgresql.org; Tue, 20 May 2025 13:48:27 +0000 Received: by mail-yb1-xb2e.google.com with SMTP id 3f1490d57ef6-e7b9972069dso2326895276.3 for ; Tue, 20 May 2025 06:48:25 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1747748903; x=1748353703; darn=lists.postgresql.org; h=to:subject:message-id:date:from:mime-version:from:to:cc:subject :date:message-id:reply-to; bh=FA4Z2bQew2KoPkWuaEi3kjJv0X/IZvgMK+PvCu/G1CM=; b=gpHHWnibKF/40aeoTBAQzGDndiQ+dyAUfXBs7aL/ldnflVAAyVV3purpVy++jqbL6M 9S9pvmd69hTbTCC2+18E90apkpiJ+vMsmjHG7jyNB1CCUZmmKgPOHANhY7X+Yumap3tX xAMMIuomt/qjCLvatuDK3AOnTdHeGZP/awWkOpwgI2tp3at+ju8Y+MehDouDA3l+Ieko YfBllCa7maJSGzqO1nbKywiPex01rPc7qyUX89JYguNGrHq0gc/CZhjU2PudCX2E4VnL v7J2lar21prTWWPbPy1rwrwzAsBUREbcVfYtRqVFrhRlcZAHKm3LVSR0M+lx+/gSM7bh m6og== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1747748903; x=1748353703; h=to:subject:message-id:date:from:mime-version:x-gm-message-state :from:to:cc:subject:date:message-id:reply-to; bh=FA4Z2bQew2KoPkWuaEi3kjJv0X/IZvgMK+PvCu/G1CM=; b=cBCxbdE3+MCq1ZyM/xgAdVmmWAFGIL2Ch5KuYQIS+rSh/dYagGhtcCZHKdHmTF/njH jNu1+Qhmq7ogvzf33gKwO0w5RnlFzLthPCsacJdhY+k2c9dN6rnzuWpUSE2PiVX5pEuP 7mbyR9WsrJLNTeN8HOQuYTn2ZQVZDv7q2p8EcfDN2+B7oUf8c0VnY2iYUIBKaJDLYaZA TIym2qfbJaK/vnG/8FXby8Uu5VjQ3EEsesM/+66csJ6L50yuyD6bKfP6UZGAVEzi3y1Z oUzUv6SQAAXurXinTcoOUjfr/LsqtDSxP3J87u+p/s0IDNtkw/zQ2YwT+hJml264pRHt Km2Q== X-Gm-Message-State: AOJu0YzK8Vl54EzWE0GTTdXLpDg49wg9MfYSVPLODxG9cHCYBfBLcOHO 6nexRgIJb9kec5IKFumdPqwiMUusiJUdkNXxh9o6u+kdavx1xTXJOV5Tc8mPhIwSPawjNJgGkfM 0++Xda6a6nefGxeaIrCCU4mnn7SSV8/QJHmgESgiHBw== X-Gm-Gg: ASbGncue81z6h6evGbWJRzL6w8BG7OOnJUSC8ou0j5t7kxrROVPjAyJLaQubEEtMtJ1 1YP+vfGAYE3P5nCydEOlDMjBx8MXdbaZM0akVvvGHXTkhAH0OFuBy4+7Hai0V0IRWnYgu3FTnMh Xad/lX6p5pIm/zDeMvnrVSXUtcLNoK8LNY X-Google-Smtp-Source: AGHT+IG+1SWiQpfuSuF86D4/vrVqtHeZ4M4nrotd5eZBz0ZsNQS/6cTSMeaqG4QyOOG2NYBL6zyOSo/8RlcKg9sTDP8= X-Received: by 2002:a05:6902:10c4:b0:e7b:9220:d5b4 with SMTP id 3f1490d57ef6-e7b9220d7c7mr15153024276.25.1747748903140; Tue, 20 May 2025 06:48:23 -0700 (PDT) MIME-Version: 1.0 From: =?UTF-8?B?0JDQvdGC0L7QvSDQk9C70YPRiNCw0LrQvtCy?= Date: Tue, 20 May 2025 16:48:02 +0300 X-Gm-Features: AX0GCFs0TpeZQLZ-_vC2We2mzfd5aUxk_bQ4jZEnDli_z-CZWyOyRj4C2lbSajA Message-ID: Subject: query hangs out To: pgsql-admin@lists.postgresql.org Content-Type: multipart/alternative; boundary="00000000000087474d063591804c" List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk --00000000000087474d063591804c Content-Type: text/plain; charset="UTF-8" Hi. I encountered a very strange behavior. For any query (even a simple count(*) to one specific table (a small 30MB table with 3 indexes, without any specific data types - everything is standard out of the box vanilla Postgres) - the query hangs dead. Waited more than 24 hours - the query did not complete). Similarly, the vacuum process to the table hangs. Only Kill -9 with a full restart helps I get a backtrace, from it - I then examined the pg_multixact directory, which at the time of the problem had swelled to 900MB and had several thousand files. I excluded long and inactive transactions, as well as prepared statements. The workaround in the end was this - truncate the table (it was successful), then vacuum freeze each DB, and after that the files from pg_multixact disappeared. What could it be? vacuum\freeze\mulitxact settings are default. At the same time, the value pg_database.datminmxid=1 Could the problem with the hang be related to the many old files in pg_multixact ? (judging by the backtrace - yes) postgresql 16.9 backtrace: #0 0x00007f83840d3bfa in clock_nanosleep@GLIBC_2.2.5 () from /lib64/libc.so.6 #1 0x00007f83840d8847 in nanosleep () from /lib64/libc.so.6 #2 0x00000000005b416c in pg_usleep (microsec=1000) at ../port/pgsleep.c:50 #3 pg_usleep (microsec=1000) at ../port/pgsleep.c:41 #4 GetMultiXactIdMembers (from_pgupgrade=, isLockOnly=, members=0x7ffd65135b40, multi=66664135) at access/transam/multixact.c:1393 #5 GetMultiXactIdMembers (multi=66664135, members=0x7ffd65135b40, from_pgupgrade=, isLockOnly=) at access/transam/multixact.c:1225 #6 0x0000000000a99df9 in MultiXactIdGetUpdateXid.constprop.0 (xmax=, t_infomask=) at access/heap/heapam.c:7073 #7 0x0000000000572355 in HeapTupleGetUpdateXid (tuple=0x7f8377fdceb0) at access/heap/heapam.c:7114 #8 HeapTupleSatisfiesVacuumHorizon (htup=, buffer=496, dead_after=0x7ffd65135c1c) at access/heap/heapam_visibility.c:1350 #9 0x0000000000577eee in heap_prune_satisfies_vacuum (buffer=496, tup=0x7ffd65135c20, prstate=0x7ffd65135ec0) at access/heap/pruneheap.c:504 #10 heap_page_prune (relation=relation@entry=0x7f83738fed70, buffer=buffer@entry=496, vistest=vistest@entry=0xd845f0 , old_snap_xmin=, old_snap_ts=, nnewlpdead=nnewlpdead@entry=0x7ffd65136ad0, off_loc=0x0) at access/heap/pruneheap.c:350 #11 0x0000000000578b91 in heap_page_prune_opt (relation=0x7f83738fed70, buffer=496) at access/heap/pruneheap.c:208 #12 0x00000000005625cf in heapgetpage (sscan=sscan@entry=0x1038218, block=block@entry=5) at access/heap/heapam.c:418 #13 0x0000000000563304 in heapgettup_pagemode (scan=scan@entry=0x1038218, dir=ForwardScanDirection, nkeys=0, key=0x0) at access/heap/heapam.c:885 #14 0x00000000005637e4 in heap_getnextslot (sscan=0x1038218, direction=, slot=0x1025410) at access/heap/heapam.c:1149 #15 0x0000000000727e8a in table_scan_getnextslot (slot=0x1025410, direction=ForwardScanDirection, sscan=) at executor/../../../src/include/access/tableam.h:1066 #16 SeqNext (node=0x1025280) at executor/nodeSeqscan.c:80 #17 0x000000000070bc41 in ExecProcNode (node=0x1025280) at executor/../../../src/include/executor/executor.h:273 #18 fetch_input_tuple (aggstate=aggstate@entry=0x1024c88) at executor/nodeAgg.c:562 #19 0x000000000070e313 in agg_retrieve_direct (aggstate=0x1024c88) at executor/nodeAgg.c:2460 #20 ExecAgg (pstate=0x1024c88) at executor/nodeAgg.c:2180 #21 0x00000000006f8512 in ExecProcNode (node=0x1024c88) at executor/../../../src/include/executor/executor.h:273 #22 ExecutePlan (execute_once=, dest=0x1073f50, direction=, numberTuples=0, sendTuples=, operation=CMD_SELECT, use_parallel_mode=, planstate=0x1024c88, estate=0x1024a70) at executor/execMain.c:1670 #23 standard_ExecutorRun (queryDesc=0x1042660, direction=, count=0, execute_once=) at executor/execMain.c:365 #24 0x00000000008c7cc5 in ExecutorRun (execute_once=, count=0, direction=ForwardScanDirection, queryDesc=0x1042660) at executor/execMain.c:309 #25 PortalRunSelect (portal=portal@entry=0xf96df0, forward=forward@entry=true, count=0, count@entry=9223372036854775807, dest=dest@entry=0x1073f50) at tcop/pquery.c:924 #26 0x00000000008c95c6 in PortalRun (portal=portal@entry=0xf96df0, count=count@entry=9223372036854775807, isTopLevel=isTopLevel@entry=true, run_once=run_once@entry=true, dest=dest@entry=0x1073f50, altdest=altdest@entry=0x1073f50, qc=0x7ffd65136f90) at tcop/pquery.c:768 #27 0x00000000008ca650 in exec_simple_query (query_string=0xed8430 "select count(*) from \"InboxState\";") at tcop/postgres.c:1274 #28 0x00000000008cc9df in PostgresMain (dbname=, username=) at tcop/postgres.c:4637 #29 0x000000000083c244 in BackendRun (port=0xf2eb50, port=0xf2eb50) at postmaster/postmaster.c:4464 #30 BackendStartup (port=0xf2eb50) at postmaster/postmaster.c:4192 #31 ServerLoop () at postmaster/postmaster.c:1782 #32 0x0000000000832c3d in PostmasterMain (argc=3, argv=0xe936d0) at postmaster/postmaster.c:1466 #33 0x000000000051bf51 in main (argc=3, argv=0xe936d0) at main/main.c:198 --00000000000087474d063591804c Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable
Hi.
I encountered a very strange behavior.
For any q= uery (even a simple count(*) to one specific table (a small 30MB table with= 3 indexes, without any specific data types - everything is standard out of= the box vanilla Postgres) - the query hangs dead. Waited more than 24 hour= s - the query did not complete).


Similarly, the vacuum proc= ess to the table hangs.
Only Kill -9 with a full restart helps

= I get a backtrace, from it - I then examined the pg_multixact directory, wh= ich at the time of the problem had swelled to 900MB and had several thousan= d files.
I excluded long and inactive transactions, as well as prepared = statements.

The workaround in the end was this - truncate the table = (it was successful), then vacuum freeze each DB, and after that the files f= rom pg_multixact disappeared.

What could it be? vacuum\freeze\mulitx= act=C2=A0 settings are default.
At the same time, the value pg_database.= datminmxid=3D1
Could the problem with the hang be related to the many ol= d files in pg_multixact ? (judging by the backtrace - yes)

postgresq= l 16.9

backtrace:
#0 =C2=A00x00007f83840d3bfa in clock_nanosleep@= GLIBC_2.2.5 () from /lib64/libc.so.6
#1 =C2=A00x00007f83840d8847 in nano= sleep () from /lib64/libc.so.6
#2 =C2=A00x00000000005b416c in pg_usleep = (microsec=3D1000) at ../port/pgsleep.c:50
#3 =C2=A0pg_usleep (microsec= =3D1000) at ../port/pgsleep.c:41
#4 =C2=A0GetMultiXactIdMembers (from_pg= upgrade=3D<optimized out>, isLockOnly=3D<optimized out>, member= s=3D0x7ffd65135b40, multi=3D66664135) at access/transam/multixact.c:1393#5 =C2=A0GetMultiXactIdMembers (multi=3D66664135, members=3D0x7ffd65135b40= , from_pgupgrade=3D<optimized out>, isLockOnly=3D<optimized out>= ;) at access/transam/multixact.c:1225
#6 =C2=A00x0000000000a99df9 in Mul= tiXactIdGetUpdateXid.constprop.0 (xmax=3D<optimized out>, t_infomask= =3D<optimized out>) at access/heap/heapam.c:7073
#7 =C2=A00x000000= 0000572355 in HeapTupleGetUpdateXid (tuple=3D0x7f8377fdceb0) at access/heap= /heapam.c:7114
#8 =C2=A0HeapTupleSatisfiesVacuumHorizon (htup=3D<opti= mized out>, buffer=3D496, dead_after=3D0x7ffd65135c1c) at access/heap/he= apam_visibility.c:1350
#9 =C2=A00x0000000000577eee in heap_prune_satisfi= es_vacuum (buffer=3D496, tup=3D0x7ffd65135c20, prstate=3D0x7ffd65135ec0) at= access/heap/pruneheap.c:504
#10 heap_page_prune (relation=3Drelation@en= try=3D0x7f83738fed70, buffer=3Dbuffer@entry=3D496, vistest=3Dvistest@entry= =3D0xd845f0 <GlobalVisDataRels.lto_priv.0>, old_snap_xmin=3D<optim= ized out>, old_snap_ts=3D<optimized out>, nnewlpdead=3Dnnewlpdead@= entry=3D0x7ffd65136ad0, off_loc=3D0x0)
=C2=A0 =C2=A0 at access/heap/prun= eheap.c:350
#11 0x0000000000578b91 in heap_page_prune_opt (relation=3D0x= 7f83738fed70, buffer=3D496) at access/heap/pruneheap.c:208
#12 0x0000000= 0005625cf in heapgetpage (sscan=3Dsscan@entry=3D0x1038218, block=3Dblock@en= try=3D5) at access/heap/heapam.c:418
#13 0x0000000000563304 in heapgettu= p_pagemode (scan=3Dscan@entry=3D0x1038218, dir=3DForwardScanDirection, nkey= s=3D0, key=3D0x0) at access/heap/heapam.c:885
#14 0x00000000005637e4 in = heap_getnextslot (sscan=3D0x1038218, direction=3D<optimized out>, slo= t=3D0x1025410) at access/heap/heapam.c:1149
#15 0x0000000000727e8a in ta= ble_scan_getnextslot (slot=3D0x1025410, direction=3DForwardScanDirection, s= scan=3D<optimized out>) at executor/../../../src/include/access/table= am.h:1066
#16 SeqNext (node=3D0x1025280) at executor/nodeSeqscan.c:80#17 0x000000000070bc41 in ExecProcNode (node=3D0x1025280) at executor/../.= ./../src/include/executor/executor.h:273
#18 fetch_input_tuple (aggstate= =3Daggstate@entry=3D0x1024c88) at executor/nodeAgg.c:562
#19 0x000000000= 070e313 in agg_retrieve_direct (aggstate=3D0x1024c88) at executor/nodeAgg.c= :2460
#20 ExecAgg (pstate=3D0x1024c88) at executor/nodeAgg.c:2180
#21= 0x00000000006f8512 in ExecProcNode (node=3D0x1024c88) at executor/../../..= /src/include/executor/executor.h:273
#22 ExecutePlan (execute_once=3D<= ;optimized out>, dest=3D0x1073f50, direction=3D<optimized out>, nu= mberTuples=3D0, sendTuples=3D<optimized out>, operation=3DCMD_SELECT,= use_parallel_mode=3D<optimized out>, planstate=3D0x1024c88, estate= =3D0x1024a70) at executor/execMain.c:1670
#23 standard_ExecutorRun (quer= yDesc=3D0x1042660, direction=3D<optimized out>, count=3D0, execute_on= ce=3D<optimized out>) at executor/execMain.c:365
#24 0x00000000008= c7cc5 in ExecutorRun (execute_once=3D<optimized out>, count=3D0, dire= ction=3DForwardScanDirection, queryDesc=3D0x1042660) at executor/execMain.c= :309
#25 PortalRunSelect (portal=3Dportal@entry=3D0xf96df0, forward=3Dfo= rward@entry=3Dtrue, count=3D0, count@entry=3D9223372036854775807, dest=3Dde= st@entry=3D0x1073f50) at tcop/pquery.c:924
#26 0x00000000008c95c6 in Por= talRun (portal=3Dportal@entry=3D0xf96df0, count=3Dcount@entry=3D92233720368= 54775807, isTopLevel=3DisTopLevel@entry=3Dtrue, run_once=3Drun_once@entry= =3Dtrue, dest=3Ddest@entry=3D0x1073f50, altdest=3Daltdest@entry=3D0x1073f50= , qc=3D0x7ffd65136f90) at tcop/pquery.c:768
#27 0x00000000008ca650 in ex= ec_simple_query (query_string=3D0xed8430 "select count(*) from \"= InboxState\";") at tcop/postgres.c:1274
#28 0x00000000008cc9df= in PostgresMain (dbname=3D<optimized out>, username=3D<optimized = out>) at tcop/postgres.c:4637
#29 0x000000000083c244 in BackendRun (p= ort=3D0xf2eb50, port=3D0xf2eb50) at postmaster/postmaster.c:4464
#30 Bac= kendStartup (port=3D0xf2eb50) at postmaster/postmaster.c:4192
#31 Server= Loop () at postmaster/postmaster.c:1782
#32 0x0000000000832c3d in Postma= sterMain (argc=3D3, argv=3D0xe936d0) at postmaster/postmaster.c:1466
#33= 0x000000000051bf51 in main (argc=3D3, argv=3D0xe936d0) at main/main.c:198<= /div> --00000000000087474d063591804c--