From: Olegs Germanovs
Date: Wed, 27 May 2026 15:33:58 +0300
Subject: 16.14 regression: startup process self-deadlocks during multixact WAL replay in RecordNewMultiXact
To: pgsql-bugs@lists.postgresql.org

Hi!

*Bug summary:*
  After upgrading from 16.13 to 16.14, archive recovery of a basebackup
  hangs indefinitely during multixact WAL replay. The startup process
  blocks acquiring MultiXactOffsetSLRULock in EXCLUSIVE mode while
  already holding one LWLock. The lock has shared_count=1 with no
  exclusive holder, no other live process appears to hold it, and the
  same recovery completes successfully on 16.13.

*Environment*:
  PostgreSQL:  16.14 (pgdg)
  OS:          Ubuntu 22.04, kernel 6.8.0-1016-aws
  Arch:        aarch64 (AWS Graviton)
  Backup tool: pgBackRest 2.53.1 (backup) → 2.58.0 (restore)
  Source:      x86_64 cluster, Postgres version - 16.6 (Ubuntu 16.6-1.pgdg22.04+1)

*Scenario*: archive recovery from pgBackRest. End-of-backup record not yet
seen. Stalls during replay of WAL segment 0000000100006BEB00000031.
Verified that the next segment is genuinely irrelevant: startup is not
waiting for WAL; it has a record in hand (frozen on the same frame
across many gdb captures separated by minutes).

*Stack of startup process* (PID 395003):

  #7  LWLockAcquire (lock=0xfdbf33f2f000, mode=LW_EXCLUSIVE)
        at storage/lmgr/lwlock.c:1314
  #8  SimpleLruWriteAll (ctl=MultiXactOffsetCtlData, ...)
        at access/transam/slru.c:1174
  #9  RecordNewMultiXact (multi=981215231, offset=2282786137,
                          nmembers=2, members=...)
        at access/transam/multixact.c:944
  #10 multixact_redo (record=...)
        at access/transam/multixact.c:3464
  #11 ApplyWalRecord -> PerformWalRecovery -> StartupXLOG

LWLock state at 0xfdbf33f2f000 (stable across 5+ snapshots):
  tranche = 14 (MultiXactOffsetSLRU)
  state.value = 0x61000000
    = LW_FLAG_RELEASE_OK | LW_FLAG_HAS_WAITERS | shared_count=1
  waiters = {head=524, tail=524}   (one waiter)

Critical evidence: the startup process holds exactly one LWLock.
  num_held_lwlocks = 1

*Combined with*:
  - No exclusive holder of the lock
  - shared_count = 1
  - Checkpointer (PID 395001) and bgwriter (PID 395002) sitting idle
    in CheckpointerMain/BackgroundWriterMain WaitLatch loops, with no
    visible work pending
  - Same gdb stack frame frozen across captures separated by minutes
  - Zero CPU, zero I/O, ctx_switches not advancing

→ The startup process is holding MultiXactOffsetSLRULock in SHARED mode
  (acquired earlier in the RecordNewMultiXact path) and now requesting
  it in EXCLUSIVE mode via SimpleLruWriteAll. Since LWLocks cannot be
  upgraded shared→exclusive, this is a self-deadlock.

Auxiliary process stacks (for completeness):

  Checkpointer (395001):
    epoll_pwait → WaitLatch (timeout=15000)
                → CheckpointerMain (checkpointer.c:535)
  Bgwriter (395002):
    epoll_pwait → WaitLatch (timeout=10000)
                → BackgroundWriterMain (bgwriter.c:336)

Both are idle in their main loops; held_lwlocks was <optimized out> in
gdb but neither process has any plausible reason to hold the SLRU lock.

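For anyone re-checking the decode of state.value above, here is a minimal
Python sketch. The bit positions are an assumption, recalled from the
LW_FLAG_* and LW_VAL_* macros in src/backend/storage/lmgr/lwlock.c on the
16 branch (HAS_WAITERS = bit 30, RELEASE_OK = bit 29, LOCKED = bit 28,
LW_VAL_EXCLUSIVE = bit 24, shared-holder count in the low 24 bits); please
verify them against the source of the exact build. decode_lwlock_state is
a hypothetical helper, not a PostgreSQL function. Under this assumed
layout, bit 24 of 0x61000000 is the exclusive marker and the shared count
is 0, so the shared_count=1 reading above may deserve a second look. An
exclusive hold by the startup process itself would still fit
num_held_lwlocks = 1 and a self-deadlock.

```python
# ASSUMPTION: bit layout recalled from PostgreSQL 16 lwlock.c.
# Verify against the source tree of the build being debugged.
LW_FLAG_HAS_WAITERS = 1 << 30   # someone is queued on the lock
LW_FLAG_RELEASE_OK = 1 << 29    # releaser may wake waiters
LW_FLAG_LOCKED = 1 << 28        # wait-list spinlock bit
LW_VAL_EXCLUSIVE = 1 << 24      # set while held in exclusive mode
LW_SHARED_MASK = (1 << 24) - 1  # low 24 bits count shared holders


def decode_lwlock_state(value):
    """Split a raw LWLock state word into flags and holder information.

    Hypothetical helper for reading gdb output; not part of PostgreSQL.
    """
    return {
        "has_waiters": bool(value & LW_FLAG_HAS_WAITERS),
        "release_ok": bool(value & LW_FLAG_RELEASE_OK),
        "wait_list_locked": bool(value & LW_FLAG_LOCKED),
        "exclusive_held": bool(value & LW_VAL_EXCLUSIVE),
        "shared_count": value & LW_SHARED_MASK,
    }


if __name__ == "__main__":
    # Value captured from the startup process in this report.
    for name, val in decode_lwlock_state(0x61000000).items():
        print(f"{name:18} {val}")
```

The same helper can be pointed at any other state.value captured from gdb
(for example with `p/x lock->state.value`) to compare snapshots.
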
pg_controldata excerpt:
  Database cluster state:               in archive recovery
  Backup start location:                6BEB/27000378
  Minimum recovery ending location:     6BEB/31DCEBE0
  Backup end location:                  0/0
  End-of-backup record required:        yes
  NextMultiXactId:                      981215122 (replay reached 981215231)
  NextMultiOffset:                      2282785918 (replay reached 2282786137)
  oldestMultiXid:                       964544775

Reproduction:
  - Restore basebackup + WAL via pgBackRest archive-get on aarch64
  - Start cluster on 16.14: hangs as described, every time, same WAL
    position
  - Stop cluster, downgrade to 16.13 (same pgdg apt source), start:
    recovery completes successfully on identical PGDATA
  - No data or environment change between the two attempts

I'm happy to apply test patches or capture additional diagnostics.

Best wishes
Olegs Germanovs