Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1vMRcw-009vqg-1T for pgsql-hackers@arkaria.postgresql.org; Fri, 21 Nov 2025 13:56:42 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.96) (envelope-from ) id 1vMRct-007eqZ-2k for pgsql-hackers@arkaria.postgresql.org; Fri, 21 Nov 2025 13:56:40 +0000 Received: from makus.postgresql.org ([2001:4800:3e1:1::229]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1vMRct-007eqJ-1O for pgsql-hackers@lists.postgresql.org; Fri, 21 Nov 2025 13:56:39 +0000 Received: from lahtoruutu.iki.fi ([185.185.170.37]) by makus.postgresql.org with smtp (Exim 4.96) (envelope-from ) id 1vMRcq-000hns-39 for pgsql-hackers@lists.postgresql.org; Fri, 21 Nov 2025 13:56:38 +0000 Received: from [10.0.2.15] (unknown [130.41.208.2]) (using TLSv1.3 with cipher TLS_AES_128_GCM_SHA256 (128/128 bits) key-exchange x25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) (Authenticated sender: hlinnaka) by lahtoruutu.iki.fi (Postfix) with ESMTPSA id 4dCcFt1mZJz49Q3d; Fri, 21 Nov 2025 15:56:34 +0200 (EET) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=iki.fi; s=lahtoruutu; t=1763733395; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=qY7PZgmPogjdjbHcAc59GzPV0ny252EmzLY36iOgJd8=; b=teaa9tUKgsdai7nXsmmEDlEGCDIXQmYYs9737lxhOPzJYhLdrEv7iE3Tr3Jggs1215O/Ct AEBfHKlLjBbMXEXmqQQdaDV9hsojvJhcppjJ/aoSnN7dKmy2vNe4AUF1Mu4O47QJqkOYy+ RXbox7I29bsWEmmEGwPHuNRGxGOgtGXeZP6kX3qxaFT+EqZOiMtqA/8hgzrssN5OP5eG4T 6yvcP59W3dixBcwTX0yjJDz5aSqWDChCyBf4T9f3XQy5joID7T86F2jrXkD/F1Zfi8XT4O VsRJMAQgwvUniiAd+fMkTel8yssioS8TPBEYBEL7b5rJohBrFZ7lOSm3+mI20w== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=iki.fi; s=lahtoruutu; t=1763733395; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=qY7PZgmPogjdjbHcAc59GzPV0ny252EmzLY36iOgJd8=; b=L5un9CiBozIxQwb1P10EAoazKgZhW0Nz0Gr9l8N9k7CHYbYliGa/so0Q3t+r7aqNBprI1B NfDRZDV63Bb/aO+Z796oobXH0f98QYFuzEH00LnNjVCOmN9G7KRlv5egB0f9Kp44MiJnsn 8e7F1ezpX/bxXe3B642qvB+IwvV7t8bSDTnpzz6MNEemESp1VObDlkB2bbwHF1GsJHPT2W xJkTT48O70k147R3SA4AvKMhBvgLPi/3IIxxLMwMkV0IfKpPHLGAivJuI5JHoKlp+7vnvF p4S+gVY9APmArCPD6mWKH3PJCsKDBf6Ejiy3ZK05NakOGgTw9vjcakFTUwGprA== ARC-Authentication-Results: i=1; ORIGINATING; auth=pass smtp.auth=hlinnaka smtp.mailfrom=hlinnaka@iki.fi ARC-Seal: i=1; a=rsa-sha256; d=iki.fi; s=lahtoruutu; cv=none; t=1763733395; b=fLRiUPy7KEoooS61xyKdc05CEIJLui2NDqt3so2XeShSSTHYOPK1vtebn/YDP1hR+PJmvG fZ6urtHLhnhqMrVGcBxcpD7dbn+NqMSYYJ9IG/ldouppgZtOrYP+JKoijrbGXMik80AmXc U0m8cIH5b8x3kjimvQKYJdzNyYaiy9F8DERojOcgehsiX64y/EYSSWcurkO6FF+vSLh5d4 ZMs+GWHQXt1zKxetbb/KNAcS4crzlDcrMcijT+0pl3i0tUeSK+OH0r/nx+EQgXrTTGaSvS wlczUybdSyHiGTFaLs7zoPIFfHjgiAzUYwxwP0kFm/AXCrGQ2emcyMsCovQ+pg== Message-ID: <6c298bc4-7029-4c1d-bf16-3e094842ce32@iki.fi> Date: Fri, 21 Nov 2025 15:56:33 +0200 MIME-Version: 1.0 User-Agent: Mozilla Thunderbird Subject: Re: POC: make mxidoff 64 bits To: Ashutosh Bapat Cc: Maxim Orlov , Alvaro Herrera , Alexander Korotkov , wenhui qiu , Postgres hackers References: <4535f3aa-3220-4760-b1f5-2bc91f248e03@iki.fi> <2bc58592-9d74-4af0-bdd1-1a88e8683f7c@iki.fi> <36531c0e-292c-409d-bbc7-a252cf6e910a@iki.fi> <54aa8f65-f0e4-4464-b543-e0399c1cab1e@iki.fi> <4a9dda70-0af7-41a4-9636-b168f2fc48ef@iki.fi> <46cc45e9-fddd-44bc-bcb3-96889aafd921@iki.fi> Content-Language: en-US From: Heikki Linnakangas In-Reply-To: Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: quoted-printable List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk On 21/11/2025 14:15, Ashutosh Bapat wrote: > On Mon, Nov 17, 2025 at 10:05=E2=80=AFPM Heikki Linnakangas wrote: >> >> Ashutosh, you were interested in reviewing this earlier. Would you hav= e >> a chance to review this now, before I commit it? >=20 > 0002 seems to be fine. It's just moving code from one file to another. > However, the name multixact_internals.h seems to be misusing term > "internal". I would expect an "internal.h" to expose things for use > within the module and not "externally" in other modules. But this > isn't the first time, buf_internal.h, toast_internal.h > bgworker_internal.h and so on have their own meaning of what > "internal" means to them. But it might be better to use something more > meaningful than "internal" in this case. The file seems to contain > declarations related to how pg_multixact SLRU is structured. To that > effect multixact_slru.h or multixact_layout.h may be more appropriate. Yeah, I went with multixact_internal.h because of the precedence. It's=20 not great, but IMHO it's better to be consistent than invent a new=20 naming scheme. > There are two offsets that that file deals with offset within > pg_multixact/offset, MultiXactOffset and member offset (flag offset > and xid offset) within pg_multixact/members. It's quite easy to get > confused between those when reading that code. Agreed, those are confusing. I'll think about that a little more. > The reason why this eliminates the need for wraparound is mentioned > somewhere in GetNewMultiXactId(), but probably it should be mentioned > at a more prominent place and also in the commit message. I expected > it to be in the prologue of GetNewMultiXactId(), and then a reference > to prologue from where the comment is right now. >=20 > ereport(ERROR, > (errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED), > errmsg("MultiXact members would wrap around"))); >=20 > If a server ever reaches this point, there is no way but to create > another cluster, if the applications requires multi-xact ids? Pretty much. You can also vacuum freeze everything, so that no multixids=20 are in use, and then use pg_resetwal to reset nextOffset to a lower value= . That obviously sounds bad, but this is a "can't happen" situation. For=20 comparison, we don't provide any better way to recover from running out=20 of LSNs either. > We should also provide this as an errhint(). I think no. You cannot run into this "organically" by just running the=20 system. If you do manage to hit it, it's a sign of some other trouble,=20 and I don't want to guess what that might be, or what might be the=20 appropriate way to recover. > In ExtendMultiXactMember() the last comment mentions a wraparound > /* > * Advance to next page, taking care to properly handle the wraparound > * case. OK if nmembers goes negative. > */ > I thought this wraparound is about offset wraparound, which can not > happen now. Since you have left the comment intact, it's something > else. Is the wraparound of offset within the page? Maybe requires a > bit more clarification? It was indeed about offset wraparound. I'll remove it. > PerformMembersTruncation(MultiXactOffset oldestOffset, MultiXactOffset > newOldestOffset) > { > ... snip ... > - segment +=3D 1; > - } > + SimpleLruTruncate(MultiXactMemberCtl, > + MXOffsetToMemberPage(newOldestOffset)); > } >=20 > Most of the callers of SimpleLruTruncate() call it directly. Why do we > want to keep this static wrapper? PerformOffsetsTruncation() has a > comment which seems to need the wrapper. But > PerformMembersTruncation() doesn't even have that. Hmm, yeah those wrappers are a bit vestigial now. I'm inclined to keep=20 them, because as you said, PerformOffsetsTruncation() provides a place=20 for the comment. And given that, it seems best to keep=20 PerformMembersTruncation(), for symmetry. > MultiXactState->oldestMultiXactId =3D newOldestMulti; > MultiXactState->oldestMultiXactDB =3D newOldestMultiDB; > + MultiXactState->oldestOffset =3D newOldestOffset; > LWLockRelease(MultiXactGenLock); >=20 > Is this something we are missing in the current code? I can not > understand the connection between this change and the fact that offset > wraparound is not possible with wider multi xact offsets. Maybe I > missed some previous discussion. Good question. At first I intended to extract that to a separate commit,=20 before the main patch, because it seems like a nice improvement: We have=20 just calculated 'oldestOffset', so why not update the value in shared=20 memory while we have it? But looking closer, I'm not sure if it'd be=20 sane without the 64-bit offsets. Currently, oldestOffset is only updated=20 by SetOffsetVacuumLimit(), which also updates offsetStopLimit. We could=20 get into a state where oldestOffset is set, but offsetStopLimit is not.=20 With 64-bit offsets that's no longer a concern because it removes=20 offsetStopLimit altogether. > I have reviewed patch 0002 and multxact.c changes in 0003. So far I > have only these comments. I will review the pg_upgrade.c changes next. Thanks for the review so far! - Heikki