Date: Tue, 17 Mar 2026 00:10:41 +0200
Subject: Re: Proposal: Prevent Primary/Standby SLRU divergence during MultiXact truncation
To: Ayush Tiwari, pgsql-hackers@postgresql.org
From: Heikki Linnakangas

On 16/03/2026 18:09, Ayush Tiwari wrote:
> The Issue:
> In TruncateMultiXact(), we write the truncation WAL record
> (WriteMTruncateXlogRec) before we actually perform the truncation via
> PerformOffsetsTruncation() -> SimpleLruTruncate().
>
> The problem arises from the "apparent wraparound" safety check inside
> SimpleLruTruncate(). If SlruScanDirectory() detects an apparent
> wraparound, SimpleLruTruncate() safely bails out and skips unlinking the
> SLRU segments on the primary, logging: could not truncate directory
> "%s": apparent wraparound.
>
> However, the WAL record for the truncation has already been flushed.
> Standbys replay this TRUNCATE_ID WAL record and blindly delete their
> SLRU segments. At this point, the primary and standby have diverged.

Replaying the record will perform the same sanity checks against
wraparound as the primary does. Hmm, although why did I not apply commit
817f74600d to 'master', only backbranches? The bug that it fixed was
related to minor version upgrade, and thus it was not needed on
'master', but the code change would nevertheless make a lot of sense on
'master' too.

> The Impact:
> If the standby is subsequently promoted to primary, any attempt to
> access rows holding those older MultiXact IDs (which the original
> primary decided to keep) will throw a FATAL: could not access status of
> transaction error, effectively resulting in data loss / inaccessible
> rows for the user.

Have you been able to reproduce that?

> While the recent commits address the immediate standby crash involving
> latest_page_number during multixact_redo(), they don't seem to prevent
> the primary from emitting a "false" WAL truncation record when it
> abandons its own truncation.
>
> Proposed Approach:
> It seems safer to only emit the WAL record if we are guaranteed to
> follow through with the truncation. We could modify SimpleLruTruncate()
> to perform its safety checks first and return a boolean indicating
> whether the truncation is safe to proceed. TruncateMultiXact() would
> then only call WriteMTruncateXlogRec() and proceed with physical
> deletion if the check passes.
>
> I have attached a rough draft patch illustrating this sequence change.

I agree that would probably be better. I'm not sure how straightforward
it will be to implement though, I wouldn't want to add much extra code
just for this.

P.S. Thanks for looking into this! This is hairy stuff, more review is
much appreciated.

- Heikki
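
For illustration only, the "check first, log second, delete last" ordering
proposed in the quoted text could be sketched roughly as below. This is a
standalone toy model, not PostgreSQL code and not the attached draft patch:
ToySlru and every toy_* function are hypothetical names, and the range test
is only a stand-in for the real wraparound check in SimpleLruTruncate().

```c
/*
 * Toy model of the proposed ordering. NOT PostgreSQL code: ToySlru and all
 * toy_* functions are hypothetical and exist only for this illustration.
 */
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

typedef struct ToySlru
{
    int oldest_segment;   /* oldest segment still "on disk" */
    int newest_segment;   /* newest segment "on disk" */
    int wal_records;      /* truncation records "written" so far */
} ToySlru;

/*
 * Stand-in for the wraparound sanity check (an assumption, much simpler
 * than the real one): a cutoff outside the known segment range is unsafe.
 */
static bool
toy_truncation_is_safe(const ToySlru *slru, int cutoff_segment)
{
    return cutoff_segment >= slru->oldest_segment &&
           cutoff_segment <= slru->newest_segment;
}

/* Stand-in for emitting the truncation WAL record. */
static void
toy_write_truncate_record(ToySlru *slru)
{
    slru->wal_records++;
}

/*
 * Run the safety check first. Only if it passes, log the truncation and
 * then remove the old segments. Returns whether truncation happened.
 */
static bool
toy_truncate(ToySlru *slru, int cutoff_segment)
{
    if (!toy_truncation_is_safe(slru, cutoff_segment))
    {
        fprintf(stderr, "skipping truncation: apparent wraparound\n");
        return false;     /* nothing logged, nothing deleted */
    }

    toy_write_truncate_record(slru);
    slru->oldest_segment = cutoff_segment;
    return true;
}
```

In this model a rejected cutoff leaves wal_records unchanged, so a replica
replaying the log would never see a truncation that the primary skipped.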