Content-Type: multipart/mixed; boundary="------------A6xV0C2mM2Tq0Vp3qdjXwj18"
Message-ID: <349f9c82-3a8b-48ad-8cc4-fe81553793dd@iki.fi>
Date: Sat, 14 Feb 2026 13:42:02 +0200
MIME-Version: 1.0
User-Agent: Mozilla Thunderbird
Subject: Re: 17.8 standby crashes during WAL replay from 17.5 primary: "could not access status of transaction"
To: Sebastian Webber, pgsql-bugs@lists.postgresql.org
Content-Language: en-US
From: Heikki Linnakangas
Cc: Andrey Borodin, Álvaro Herrera, Dmitry Yurichev, Chao Li, Ivan Bykov, Kirill Reshke
Precedence: bulk

This is a multi-part message in MIME
format.
--------------A6xV0C2mM2Tq0Vp3qdjXwj18
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit

On 13/02/2026 22:31, Sebastian Webber wrote:
> PostgreSQL version: 17.8 (standby), 17.5 (primary)
>
> Primary: PostgreSQL 17.5 (Debian 17.5-1.pgdg130+1) on aarch64-unknown-linux-gnu
> Standby: PostgreSQL 17.8 (Debian 17.8-1.pgdg13+1) on aarch64-unknown-linux-gnu
>
> Platform: Docker containers on macOS (Apple Silicon / aarch64), Docker Desktop
>
>
> Description
> -----------
>
> A PostgreSQL 17.8 standby crashes during WAL replay when streaming
> from a 17.5 primary. The crash occurs after replaying a
> MultiXact/TRUNCATE_ID record followed by a MultiXact/CREATE_ID
> record.

Thanks for the report, I can repro it with your script. It is indeed a
regression introduced in the latest minor release, in the logic to
replay multixact WAL generated on older minor versions. (Commit
8ba61bc063). Adding the folks from the thread that led to that commit.

The commit added this in RecordNewMultiXact():

>     /*
>      * Older minor versions didn't set the next multixid's offset in this
>      * function, and therefore didn't initialize the next page until the next
>      * multixid was assigned.  If we're replaying WAL that was generated by
>      * such a version, the next page might not be initialized yet.  Initialize
>      * it now.
>      */
>     if (InRecovery &&
>         next_pageno != pageno &&
>         pg_atomic_read_u64(&MultiXactOffsetCtl->shared->latest_page_number) == pageno)
>     {
>         elog(DEBUG1, "next offsets page is not initialized, initializing it now");

The idea is that if the next offset falls on a different page
(next_pageno != pageno), and we have not yet initialized the next page
(pg_atomic_read_u64(&MultiXactOffsetCtl->shared->latest_page_number) ==
pageno), we initialize it now. However, that last check goes wrong after
a truncation record is replayed.
Replaying a truncation record does this:

>
>         /*
>          * During XLOG replay, latest_page_number isn't necessarily set up
>          * yet; insert a suitable value to bypass the sanity test in
>          * SimpleLruTruncate.
>          */
>         pageno = MultiXactIdToOffsetPage(xlrec.endTruncOff);
>         pg_atomic_write_u64(&MultiXactOffsetCtl->shared->latest_page_number,
>                             pageno);

Thanks to that, latest_page_number moves backwards to a much older page
number. That breaks the "was the next offset page already initialized?"
test in RecordNewMultiXact().

I don't understand why that "bypass the sanity check" is needed. As far
as I can see, latest_page_number is tracked accurately during WAL
replay, and should already be set up. It's initialized in
StartupMultiXact(), and updated whenever the next page is initialized.
That code was introduced a long time ago, in commit 4f627f8973, which in
turn was backpatched and had to deal with WAL that was generated before
that commit. I suspect it was necessary back then, for backwards
compatibility, but isn't necessary any more.

Hence, I propose to remove that "bypass the sanity check" code
(attached). Does anyone see a scenario where latest_page_number might
not be set correctly?

If we want to play it even more safe -- and I guess that's the right
thing to do for backpatching -- we could set latest_page_number
*temporarily* while we do the truncation, and restore the old value
afterwards.

This fixes the bug. With this fix, you can replay WAL that's already
been generated.
- Heikki --------------A6xV0C2mM2Tq0Vp3qdjXwj18 Content-Type: text/x-patch; charset=UTF-8; name="0001-Don-t-reset-latest_page_number-when-replaying-multix.patch" Content-Disposition: attachment; filename*0="0001-Don-t-reset-latest_page_number-when-replaying-multix.pa"; filename*1="tch" Content-Transfer-Encoding: base64 RnJvbSA1OTU1NmU1YjI0Zjc5NzNiODU3ZTU0ZTZmY2QxMzZkNDAxYzlmZjBmIE1vbiBTZXAg MTcgMDA6MDA6MDAgMjAwMQpGcm9tOiBIZWlra2kgTGlubmFrYW5nYXMgPGhlaWtraS5saW5u YWthbmdhc0Bpa2kuZmk+CkRhdGU6IFNhdCwgMTQgRmViIDIwMjYgMTM6MzA6MDMgKzAyMDAK U3ViamVjdDogW1BBVENIIDEvMV0gRG9uJ3QgcmVzZXQgJ2xhdGVzdF9wYWdlX251bWJlcicg d2hlbiByZXBsYXlpbmcgbXVsdGl4aWQKIHRydW5jYXRpb24KCidsYXRlc3RfcGFnZV9udW1i ZXInIGlzIHNldCB0byB0aGUgY29ycmVjdCB2YWx1ZSwgYWNjb3JkaW5nIHRvCm5leHRPZmZz ZXQsIGVhcmx5IGF0IHN5c3RlbSBzdGFydHVwLiBDb250cmFyeSB0byB0aGUgY29tbWVudCwg aXQgaGVuY2UKc2hvdWxkIGJlIHNldCB1cCBjb3JyZWN0bHkgYnkgdGhlIHRpbWUgd2UgZ2V0 IHRvIFdBTCByZXBsYXkuCgpUaGlzIGZpeGVzIGEgZmFpbHVyZSB0byByZXBsYXkgV0FMIGdl bmVyYXRlZCBvbiBvbGRlciBtaW5vciB2ZXJzaW9ucywKYmVmb3JlIGNvbW1pdCA3ODlkNjUz NjRjICgxOC4yLCAxNy44LCAxNi4xMiwgMTUuMTYsIDE0LjIxKS4KCkRpc2N1c3Npb246IGh0 dHBzOi8vd3d3LnBvc3RncmVzcWwub3JnL21lc3NhZ2UtaWQvMjAyNjAyMTQwOTAxNTAuR0My Mjk3QHA0Ni5kZWR5bi5pbztsaWdodG5pbmcucDQ2LmRlZHluLmlvCi0tLQogc3JjL2JhY2tl bmQvYWNjZXNzL3RyYW5zYW0vbXVsdGl4YWN0LmMgfCAxMCAtLS0tLS0tLS0tCiAxIGZpbGUg Y2hhbmdlZCwgMTAgZGVsZXRpb25zKC0pCgpkaWZmIC0tZ2l0IGEvc3JjL2JhY2tlbmQvYWNj ZXNzL3RyYW5zYW0vbXVsdGl4YWN0LmMgYi9zcmMvYmFja2VuZC9hY2Nlc3MvdHJhbnNhbS9t dWx0aXhhY3QuYwppbmRleCBjODYzZTRlMDU1Ni4uZTQ1ZWMwZDcyNDcgMTAwNjQ0Ci0tLSBh L3NyYy9iYWNrZW5kL2FjY2Vzcy90cmFuc2FtL211bHRpeGFjdC5jCisrKyBiL3NyYy9iYWNr ZW5kL2FjY2Vzcy90cmFuc2FtL211bHRpeGFjdC5jCkBAIC0zNTcxLDcgKzM1NzEsNiBAQCBt dWx0aXhhY3RfcmVkbyhYTG9nUmVhZGVyU3RhdGUgKnJlY29yZCkKIAllbHNlIGlmIChpbmZv ID09IFhMT0dfTVVMVElYQUNUX1RSVU5DQVRFX0lEKQogCXsKIAkJeGxfbXVsdGl4YWN0X3Ry dW5jYXRlIHhscmVjOwotCQlpbnQ2NAkJcGFnZW5vOwogCiAJCW1lbWNweSgmeGxyZWMsIFhM b2dSZWNHZXREYXRhKHJlY29yZCksCiAJCQkgICBTaXplT2ZNdWx0aVhhY3RUcnVuY2F0ZSk7 
CkBAIC0zNTk2LDE1ICszNTk1LDYgQEAgbXVsdGl4YWN0X3JlZG8oWExvZ1JlYWRlclN0YXRl ICpyZWNvcmQpCiAJCVNldE11bHRpWGFjdElkTGltaXQoeGxyZWMuZW5kVHJ1bmNPZmYsIHhs cmVjLm9sZGVzdE11bHRpREIsIGZhbHNlKTsKIAogCQlQZXJmb3JtTWVtYmVyc1RydW5jYXRp b24oeGxyZWMuc3RhcnRUcnVuY01lbWIsIHhscmVjLmVuZFRydW5jTWVtYik7Ci0KLQkJLyoK LQkJICogRHVyaW5nIFhMT0cgcmVwbGF5LCBsYXRlc3RfcGFnZV9udW1iZXIgaXNuJ3QgbmVj ZXNzYXJpbHkgc2V0IHVwCi0JCSAqIHlldDsgaW5zZXJ0IGEgc3VpdGFibGUgdmFsdWUgdG8g YnlwYXNzIHRoZSBzYW5pdHkgdGVzdCBpbgotCQkgKiBTaW1wbGVMcnVUcnVuY2F0ZS4KLQkJ ICovCi0JCXBhZ2VubyA9IE11bHRpWGFjdElkVG9PZmZzZXRQYWdlKHhscmVjLmVuZFRydW5j T2ZmKTsKLQkJcGdfYXRvbWljX3dyaXRlX3U2NCgmTXVsdGlYYWN0T2Zmc2V0Q3RsLT5zaGFy ZWQtPmxhdGVzdF9wYWdlX251bWJlciwKLQkJCQkJCQlwYWdlbm8pOwogCQlQZXJmb3JtT2Zm c2V0c1RydW5jYXRpb24oeGxyZWMuc3RhcnRUcnVuY09mZiwgeGxyZWMuZW5kVHJ1bmNPZmYp OwogCiAJCUxXTG9ja1JlbGVhc2UoTXVsdGlYYWN0VHJ1bmNhdGlvbkxvY2spOwotLSAKMi40 Ny4zCgo= --------------A6xV0C2mM2Tq0Vp3qdjXwj18--