From: Sebastian Webber
Date: Fri, 13 Feb 2026 17:31:18 -0300
Subject: 17.8 standby crashes during WAL replay from 17.5 primary: "could not access status of transaction"
To: pgsql-bugs@lists.postgresql.org

PostgreSQL version: 17.8 (standby), 17.5 (primary)
Primary: PostgreSQL 17.5 (Debian 17.5-1.pgdg130+1) on aarch64-unknown-linux-gnu
Standby: PostgreSQL 17.8 (Debian 17.8-1.pgdg13+1) on aarch64-unknown-linux-gnu
Platform: Docker containers on macOS (Apple Silicon / aarch64), Docker Desktop


Description
-----------

A PostgreSQL 17.8 standby crashes during WAL replay when streaming
from a 17.5 primary. The crash occurs after replaying a
MultiXact/TRUNCATE_ID record followed by a MultiXact/CREATE_ID
record.


Steps to reproduce
------------------

1. Start a 17.5 primary configured for streaming replication
2. Seed a database with ~2GB of data (tables with foreign key
   constraints)
3. Start a 17.5 standby via pg_basebackup, confirm streaming
   replication
4. Generate ~500K MultiXact IDs using concurrent SELECT ... FOR SHARE
   / FOR KEY SHARE on the same rows
5. Run VACUUM on the multixact-heavy tables (generates TRUNCATE_ID
   WAL records)
6. Stop the 17.5 standby
7. Continue generating ~2M additional MultiXact IDs on the primary
   (builds WAL backlog)
8. Start a 17.8 standby on the same data volume -- it begins
   replaying the WAL backlog
9. Standby crashes during replay

An automated reproducer (Go program + shell scripts) is available at:
https://gist.github.com/sebastianwebber/2cd25d298bfe85cabcd8d41f83591acb

It requires Go 1.22+ and Docker. Typical runtime is ~10 minutes.

  go run main.go --cleanup


Actual output (standby log)
----------------------------

The standby successfully replays multiple SLRU page boundaries with
this pattern:

  DEBUG:  next offsets page is not initialized, initializing it now
  CONTEXT:  WAL redo at 3/28C148D8 for MultiXact/CREATE_ID: 856063 offset 6680130 nmembers 9: ...
  DEBUG:  skipping initialization of offsets page 418 because it was already initialized on multixid creation
  CONTEXT:  WAL redo at 3/28C149B8 for MultiXact/ZERO_OFF_PAGE: 418

This repeats for pages 408 through 418. Then a truncation occurs:

  DEBUG:  replaying multixact truncation: offsets [1, 490986), offsets segments [0, 7), members [1, 3864017), members segments [0, 49)
  CONTEXT:  WAL redo at 3/29D6D548 for MultiXact/TRUNCATE_ID: offsets [1, 490986), members [1, 3864017)
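
As a cross-check, the segment ranges in the truncation record can be derived from the multixact bounds it reports. The following is a minimal sketch, not PostgreSQL code: it assumes stock PostgreSQL 17 build constants (8 kB blocks, 4-byte multixact offsets, 1636 members per page, 32 SLRU pages per segment) and that the log prints segment numbers in hexadecimal. The helper names are made up for illustration.

```python
# Assumed PostgreSQL 17 default build constants (not taken from the log).
BLCKSZ = 8192
SLRU_PAGES_PER_SEGMENT = 32
OFFSETS_PER_PAGE = BLCKSZ // 4          # 4-byte offset entries -> 2048 per page
MEMBERS_PER_PAGE = (BLCKSZ // 20) * 4   # 20-byte groups of 4 members -> 1636 per page


def segment_of(value: int, entries_per_page: int) -> int:
    """Return the SLRU segment number that holds the given entry."""
    return value // entries_per_page // SLRU_PAGES_PER_SEGMENT


def truncation_segments(start: int, end: int, entries_per_page: int) -> str:
    """Format a half-open segment range in lowercase hex, as in the log line."""
    lo = segment_of(start, entries_per_page)
    hi = segment_of(end, entries_per_page)
    return f"[{lo:x}, {hi:x})"


# Bounds copied from the TRUNCATE_ID record above.
print("offsets segments", truncation_segments(1, 490986, OFFSETS_PER_PAGE))
print("members segments", truncation_segments(1, 3864017, MEMBERS_PER_PAGE))
```

Under these assumptions both computed ranges agree with the ones in the DEBUG line, so the truncation itself removes only segments below offsets segment 7.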

The very next CREATE_ID crashes:

  FATAL:  could not access status of transaction 858112
  DETAIL:  Could not read from file "pg_multixact/offsets/000D" at offset 24576: read too few bytes.
  CONTEXT:  WAL redo at 3/2A3AB408 for MultiXact/CREATE_ID: 858111 offset 6695072 nmembers 5: 1048228 (sh) 1048271 (keysh) 1048316 (sh) 1048344 (keysh) 1048370 (sh)

  LOG:  startup process (PID 29) exited with exit code 1
  LOG:  shutting down due to startup process failure
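
The file name and read offset in the DETAIL line can be tied back to the multixact IDs with some SLRU arithmetic. This is a sketch under assumed PostgreSQL 17 defaults (8 kB blocks, 4-byte offset entries, 32 pages per segment, four-hex-digit segment file names); `locate_offset_entry` is a hypothetical helper, not a PostgreSQL function.

```python
# Assumed PostgreSQL 17 default build constants (not taken from the log).
BLCKSZ = 8192
SLRU_PAGES_PER_SEGMENT = 32
OFFSETS_PER_PAGE = BLCKSZ // 4  # 2048 four-byte offset entries per page


def locate_offset_entry(multixact_id: int) -> dict:
    """Map a multixact ID to its offsets page, entry, segment file and byte offset."""
    page = multixact_id // OFFSETS_PER_PAGE
    entry = multixact_id % OFFSETS_PER_PAGE
    segment = page // SLRU_PAGES_PER_SEGMENT
    return {
        "page": page,
        "entry": entry,
        "file": f"pg_multixact/offsets/{segment:04X}",
        # Byte position of the start of the page within its segment file.
        "read_offset": (page % SLRU_PAGES_PER_SEGMENT) * BLCKSZ,
    }


created = locate_offset_entry(858111)  # ID in the CREATE_ID record
failed = locate_offset_entry(858112)   # ID in the FATAL message

print(created)
print(failed)
```

Under these assumptions, 858111 is the last entry on offsets page 418 and 858112 is the first entry on page 419, which maps to file 000D at byte offset 24576, the same location the failed read reports. Page 419 is one page past the range (408 through 418) that was initialized earlier in the log.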


Expected output
---------------

The standby should successfully replay all WAL records and reach a
consistent streaming state.


Configuration (non-default on primary)
--------------------------------------

  wal_level = replica
  max_wal_senders = 10
  max_connections = 1200
  shared_buffers = 256MB
  wal_keep_size = 16GB
  autovacuum_multixact_freeze_max_age = 100000
  vacuum_multixact_freeze_min_age = 1000
  vacuum_multixact_freeze_table_age = 50000

Standby configured with log_min_messages = debug1.

--
Sebastian Webber