From: Heikki Linnakangas
To: Evgeny Voropaev, Andres Freund, PostgreSQL Hackers, Andrey Borodin
Subject: Re: Compress prune/freeze records with Delta Frame of Reference algorithm
Date: Tue, 21 Apr 2026 10:20:46 +0300
In-Reply-To: <380a9035-54e3-4ee2-bdde-fe7526c34b6a@tantorlabs.com>
References: <9d79f4b9-1a1c-4177-bdfa-3df9a5171db9@iki.fi> <380a9035-54e3-4ee2-bdde-fe7526c34b6a@tantorlabs.com>

On 21/04/2026 08:41, Evgeny Voropaev wrote:
> Hello hackers,
>
>> Can this DFoR code replace integerset.c easily? Can we use it for
>> the vacuum dead TID list? For GIN posting lists? Where else?
>
> Heikki, thank you for your attention and proposals. I'm learning areas
> you proposed to be developed. This took time, since I am not adept at
> them. Last week I also have been developing the DFoR patch to support
> unsorted sequences.
> That's why there was the delay in answering.
>
> About GIN.
> Since GIN exploits TIDs sequences and saves it on the disk, it can be
> the most appropriate candidate to be developed with DFoR.

+1. And maybe the TID lists in B-tree tuples too, while we're at it.

For GIN posting lists, one important property of the current compression
scheme is that removing TIDs never makes the list larger than the
original. That's important for VACUUM, see
https://github.com/postgres/postgres/blob/d3bba041543593eb5341683107d899734dc8e73e/src/backend/access/gin/ginpostinglist.c#L55

> About the dead TID list.
> If I'm not mistaken, the dead TID list exists only in RAM and never on
> the disk or in the network. So, what is the advantage supposed to be
> achieved due to using compression in the dead TID list?

It reduces memory usage. And if it's faster to look up than the current
data structure, that too. I don't know whether that works out.

> About the GiST vacuuming and the use of integerset in it.
> The integerset implements a tree in addition to compression.
> DFoR now performs only compression. Moreover the size of a pack is
> flexible (varying), which must become an issue for its usage in the
> tree. It needs more thorough further elaboration to be developed.

Hmm. The integerset is a sparse list of integers, just like Frame of
Reference. The tree inside it is just an implementation detail. I was
thinking that you could replace the whole tree with DFoR, but I suppose
you cannot do random lookups in a DFoR-compressed list, so you'd still
need the tree.

- Heikki