Date: Tue, 26 May 2026 17:02:27 +0900
From: Michael Paquier
To: Ayush Tiwari
Cc: Radim Marek, Andrey Borodin, Heikki Linnakangas, Marko Tiikkaja, PostgreSQL mailing lists
Subject: Re: BUG #19490: Streaming standby on 16.14 stops applying WAL on MultiXactOffsetSLRU when primary is 16.8
References:
 <19490-9c59c6a583513b99@postgresql.org>
 <46FE61C9-F273-45FD-BED7-0F8CDA6EB992@yandex-team.ru>
 <46DB3CAB-EA1C-41A5-9D6D-5F913A2AAF66@yandex-team.ru>

On Fri, May 22, 2026 at 10:21:32PM +0530, Ayush Tiwari wrote:
> I think the right fix is to remove that SimpleLruWriteAll() call while
> keeping the missing-page initialization logic. The flush is only meant to
> make SimpleLruDoesPhysicalPageExist() see pages that exist in SLRU buffers
> but have not reached disk. In this fallback path, I don't see a way for
> the tested next_pageno to be in that state: if RecordNewMultiXact() itself
> initializes the page, it writes it synchronously with SimpleLruWritePage()
> before setting last_initialized_offsets_page.

FWIW, I have a couple of customers complaining about that as well, as
cross-version physical replication is a thing for minor upgrade flows.
This bug is suddenly making recovery disruptive for some folks out
there. :(

> I attached a small patch for REL_16_STABLE. The same self-deadlock pattern
> is also present on PG 14 and 15. PG 17 and
> 18 have the same compatibility call, but SLRU locking is banked
> there, and RecordNewMultiXact() does not appear to hold the relevant bank
> lock before calling SimpleLruWriteAll(), so I would not describe those
> branches as having this exact self-deadlock, but needs more analysis.

So your root argument is that while the SimpleLruWriteAll() is
defensive, it is not actually necessary: if
last_initialized_offsets_page is -1, we have not yet replayed
ZERO_OFF_PAGE, so we have no dirty page that could make
SimpleLruDoesPhysicalPageExist() return an incorrect result, which
would be bad.

I am not sure I agree that this assumption is correct all the time.
See for example the WAL sequence mentioned in the thread that has led
to 77dff5d937b1:
https://www.postgresql.org/message-id/33319276-e4d0-4773-89e4-09084905fdb0%40iki.fi

That message mentions this WAL sequence, which is possible because
there is no strict ordering in the creation of the mxacts:
ZERO_PAGE:2048 -> CREATE_ID:2048 -> CREATE_ID:2049 -> CREATE_ID:2047

Based on that, if we begin recovery after ZERO_PAGE:2048, we could
finish with this kind of sequence:
CREATE_ID:2048 -> CREATE_ID:2049 -> CREATE_ID:2047

Looking closer, last_initialized_offsets_page stays at -1. The page
for 2048 was zeroed before the checkpoint by the earlier
ZERO_PAGE:2048. CREATE_ID:2048 and CREATE_ID:2049 are replayed first.
Then comes CREATE_ID:2047, which enters the
last_initialized_offsets_page branch. If we don't have the WriteAll(),
the page where the offsets of 2048 and 2049 are located gets zeroed
while creating 2047, corrupting the existing state of 2048 and 2049.

A different approach would be to release and re-acquire the
MultiXactOffsetSLRULock around the call to SimpleLruWriteAll(), and I
think that it should actually be safe. Even if read-only backends
evict dirty pages between the moment the lock is released and the
moment it is re-acquired in SimpleLruWriteAll(), the pages would be
written to disk due to the eviction, which is what we want for
correctness. And only the startup process dirties offset pages during
recovery, AFAIK.

Thoughts?
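
To make the CREATE_ID:2048 -> CREATE_ID:2049 -> CREATE_ID:2047 case
easier to follow, here is a toy model of it in Python. It is only a
sketch, not PostgreSQL code, and it rests on several assumptions of
mine: 2048 offsets per page, each CREATE_ID record storing both its own
starting offset and the next multixact's, the fallback check firing
only when the next multixact lives on a different page, and the page
for 2048 sitting in buffers only when CREATE_ID:2047 is replayed. The
class, the helper names and the offset values are all made up.

```python
OFFSETS_PER_PAGE = 2048  # assumption: puts 2047 and 2048 on different pages


class ToyOffsetsSlru:
    """Toy stand-in for the multixact offsets SLRU (not PostgreSQL code)."""

    def __init__(self):
        self.disk = {}     # pageno -> offsets that reached disk
        self.buffers = {}  # pageno -> offsets held in memory (treated as dirty)

    def write_all(self):
        # Plays the role of SimpleLruWriteAll(): flush every buffered page.
        for pageno, page in self.buffers.items():
            self.disk[pageno] = list(page)

    def physical_page_exists(self, pageno):
        # Plays the role of SimpleLruDoesPhysicalPageExist(): disk only.
        return pageno in self.disk

    def zero_page(self, pageno):
        self.buffers[pageno] = [0] * OFFSETS_PER_PAGE

    def page(self, pageno):
        # Load the page from disk on first access.
        if pageno not in self.buffers:
            self.buffers[pageno] = list(self.disk[pageno])
        return self.buffers[pageno]


def replay_create_id(slru, multi, offset, next_offset, flush_first):
    """Replay one CREATE_ID record with last_initialized_offsets_page == -1."""
    pageno = multi // OFFSETS_PER_PAGE
    next_multi = multi + 1
    next_pageno = next_multi // OFFSETS_PER_PAGE
    if next_pageno != pageno:
        # Fallback path: decide whether the next page must be initialized.
        if flush_first:
            slru.write_all()
        if not slru.physical_page_exists(next_pageno):
            slru.zero_page(next_pageno)
    slru.page(pageno)[multi % OFFSETS_PER_PAGE] = offset
    slru.page(next_pageno)[next_multi % OFFSETS_PER_PAGE] = next_offset


def run_scenario(flush_first):
    slru = ToyOffsetsSlru()
    slru.disk[0] = [0] * OFFSETS_PER_PAGE  # page holding 2047, already on disk
    slru.zero_page(1)  # assumption: page for 2048 is zeroed in buffers only
    # (multi, offset, next_offset); the offset values are invented.
    records = [(2048, 100, 102), (2049, 102, 105), (2047, 98, 100)]
    for multi, offset, next_offset in records:
        replay_create_id(slru, multi, offset, next_offset, flush_first)
    return slru.page(1)[:3]  # offsets stored for 2048, 2049 and 2050


if __name__ == "__main__":
    print("with flush:   ", run_scenario(flush_first=True))
    print("without flush:", run_scenario(flush_first=False))
```

In this model, skipping the flush leaves only the entry that
CREATE_ID:2047 itself rewrites for 2048, while the offsets recorded
earlier for 2049 and 2050 come back as zero. With the flush, all three
entries survive.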

> Added both Andrey and Heikki in to-mail, since I'm not sure if this
> is more extreme than the multixact offset issue we had with 16.12, or it
> is at par with that.

Indeed, let's wait for at least Heikki's input.

Anyway, for any fixes, I don't think that it would be a good idea to
skip v17 and v18, relying on the SLRU bank locks not conflicting to
get around the WriteAll() conflict. Let's keep all the branches across
v14~v18 in sync.
--
Michael