Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1vQlas-00HOlT-35 for pgsql-hackers@arkaria.postgresql.org; Wed, 03 Dec 2025 12:04:27 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.96) (envelope-from ) id 1vQlar-00D9Bz-38 for pgsql-hackers@arkaria.postgresql.org; Wed, 03 Dec 2025 12:04:26 +0000 Received: from magus.postgresql.org ([2a02:c0:301:0:ffff::29]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1vQlar-00D9Br-21 for pgsql-hackers@lists.postgresql.org; Wed, 03 Dec 2025 12:04:25 +0000 Received: from lahtoruutu.iki.fi ([2a0b:5c81:1c1::37]) by magus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1vQlaq-002uhM-0K for pgsql-hackers@lists.postgresql.org; Wed, 03 Dec 2025 12:04:25 +0000 Received: from [10.0.2.15] (unknown [130.41.208.2]) (using TLSv1.3 with cipher TLS_AES_128_GCM_SHA256 (128/128 bits) key-exchange x25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) (Authenticated sender: hlinnaka) by lahtoruutu.iki.fi (Postfix) with ESMTPSA id 4dLxBh4y4Lz49Q2Z; Wed, 03 Dec 2025 14:04:12 +0200 (EET) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=iki.fi; s=lahtoruutu; t=1764763453; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=Yt17rDjIVVsvLLJmuCOA3oHYZislJkaOKzPB9GK6yOY=; b=Cweb3ymqGcAm+ATxSamwChtCJlflTbi7lcDNKisMYJ+8FPsHDQCZiqM/634JgQ29Q+9b85 LZ7XvwjYpTCjoY1McH347x25xZjGfHEvS89qoe8lKTk3YccIcRlp7mu0HYkt1jZzdlGj7m SkWKnv5hWPNlsB7RyYuIO2ZYSaOZeseWIX7JJ4wubk/qGdVh9GrngZwxtLNGVvdeLvms7y PpgzZ8oiMHa0MUydsOqkun/NhawG7MNFwf/HMgyT3q2/2j1BKQNrMX7F9pTuwoyj0x8dqe OxooI3RsxYu9Tx9tOJVS3Z20N6lUpLGoQQGVSq905La613eHpnIlvSFhe1M/Pg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=iki.fi; s=lahtoruutu; t=1764763453; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=Yt17rDjIVVsvLLJmuCOA3oHYZislJkaOKzPB9GK6yOY=; b=rfMYWAt02uCso5z8m/HRGfhoeNICkKCM5Am8smXShrnZ7eBZaKxWG+jHB4ZXahDqbr3G6Q 7S82AVW2LJ7cCsOhnWglJkNEjQ3S5XpVkZO7RBCOjhl3nJO4xQE1Dfm6fF7NylFk0ZsqSu Hk1nRfuEicSDYPl9i701yIENMVjSZ5oQAJdWYNP2V3c4SLv+BV5fqNF1/ULF6k0IZy3Hm2 j1upPJimeP4MreHaKC/VKNFpuwIE+nRdIP0JZAeRtOrIKgYtipRQ4DVNEq0IbE6Z18x5N0 5RnuMM7CZEdsuUk9Yp9T6PTnYDf3g6ouYDNjk4tlbOGpO9CPpLmRp4X0AvfKMg== ARC-Seal: i=1; a=rsa-sha256; d=iki.fi; s=lahtoruutu; cv=none; t=1764763453; b=TvwiRBwQVZqchWTACpbhT/tUVsYk1h3l5cIr24iy/hdORAoj1+fvc76i9KOP62CcdOlARm ClQSrMyuEMNnAtrzvxuqlFn3eN4QddcBddVCkLIqNu+rwCsGraisyN8GC2D5s8TENqAnMq 8q57NctzNEb/hhjzuzNrU+p6dgi105ehqQZRJZS1ISi8CfkPLWnskLBZQ4n/itgFofePHo MCoG+qrA3r+ixO9JL/H+RvK5/zss4cUaU5y0K9HuEGCE5iPuz62W6eO2zmE2FLGSFZL4XT Kn+JqOMr2Gk9yUTnu6+Vu3F7HcFRU94EyDkR6Fw/bxXnuQCUBWFXkj1irGZDTQ== ARC-Authentication-Results: i=1; ORIGINATING; auth=pass smtp.auth=hlinnaka smtp.mailfrom=hlinnaka@iki.fi Message-ID: <5c4b2864-3baf-4ebc-8405-280fc1a1c8a9@iki.fi> Date: Wed, 3 Dec 2025 14:04:11 +0200 MIME-Version: 1.0 User-Agent: Mozilla Thunderbird Subject: Re: POC: make mxidoff 64 bits To: Maxim Orlov Cc: Alexander Korotkov , Alvaro Herrera , wenhui qiu , Postgres hackers , Ashutosh Bapat References: <36531c0e-292c-409d-bbc7-a252cf6e910a@iki.fi> <54aa8f65-f0e4-4464-b543-e0399c1cab1e@iki.fi> <4a9dda70-0af7-41a4-9636-b168f2fc48ef@iki.fi> <46cc45e9-fddd-44bc-bcb3-96889aafd921@iki.fi> <6c298bc4-7029-4c1d-bf16-3e094842ce32@iki.fi> <9ee6324a-44fc-42fb-bf8e-7c3b53395588@iki.fi> <52227f05-51aa-40c4-8f83-9c79fff16175@iki.fi> Content-Language: en-US From: Heikki Linnakangas In-Reply-To: Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk On 03/12/2025 11:54, Maxim Orlov wrote: > The biggest problem with compression, in my opinion, is that losing > even one byte causes the loss of the entire compressed block in the > worst case scenario. After all, we still don't have checksums for the > SLRU's, which is a shame by itself. > > Again, I'm not against the idea of compression, but the risks need to > be considered. There are plenty of such critical bytes in the system where a single bit flip renders the whole block unreadable. Actually, if we had checksums on SLRU pages, a single bit flip anywhere in the page would make the checksum fail and render the block unreadable. If things go really bad and you need to open a hex editor and try to fix the data manually, it shouldn't be too hard to deduce the correct base offset from surrounding data. > As a software developer, I definitely want to implement compression and > save a few gigabytes. However, given my previous experience using > Postgres in real-world applications, reliability at the cost of several > gigabytes would not have caused me any trouble. Just saying. +1. If we decide to do some kind of compression here, I want it to be very simple. Otherwise it's just not worth the code complexity and risk. Let's do the math of how much disk space we'd save. Let's assume the worst case that every multixid consists of only one transaction ID. Currently, every such multixid takes up 4 bytes in the offsets SLRU, and 5 bytes in the members SLRU (one flag byte and 4 bytes for the XID). So that's 9 bytes. With 64-bit offsets, it becomes 13 bytes. With the compression, we're back to 9 bytes again (ignoring the one base offset per page). So in an extreme case that you have 1 billion multixids, with only one XID per multixid, the difference is between 9 GB and 13 GB. That seems acceptable. And having just one XID per multixid is a rare corner case. Much more commonly, you have at at least two XIDs. With two XIDs per multixid, the difference is between 14 bytes and 18 bytes. And having a billion multixids is pretty extreme. Your database is likely very large too if you reach that point, and a few gigabytes won't matter. One could argue that the memory needed for the SLRU cache matters more than the disk space. That's perhaps true, but I think this is totally acceptable from that point of view, too. - Heikki