Message-ID: <964e72f1-89aa-4512-9d7a-ec722165d291@Data-Bene.io>
Date: Tue, 21 Jan 2025 10:26:00 +0100
Subject: Re: deb package sizes
From: Cédric Villemain
Organization: Data Bene
To: Magnus Hagander, Álvaro Hernández
Cc: Jeremy Schneider, Christoph Berg, pgsql-pkg-debian@lists.postgresql.org
References: <20250109005301.3b145092@jeremy-ThinkPad-T430s> <20250109090845.24e1df6e@jeremy-ThinkPad-T430s>


On 10/01/2025 10:52, Magnus Hagander wrote:
On Thu, Jan 9, 2025 at 11:40 PM Álvaro Hernández <aht@ongres.com> wrote:


On 9/1/25 18:08, Jeremy Schneider wrote:
On Thu, 9 Jan 2025 17:06:57 +0100
Álvaro Hernández <aht@ongres.com> wrote:

On 9/1/25 10:07, Christoph Berg wrote:
Re: Jeremy Schneider  
I'm wondering if there might be any support for providing a
"postgresql-slim" package on PGDG which excludes llvm and python? I
think this might almost cut the total install size in half, and I
think there might be many users who would value having the option.
 
Hi,

could you explain why 250 MB is too much? Disk space these days is
ultra cheap  
     Hi Christoph.

     Container images are meant to contain only the files needed to
run the process that is started when the image is run. As such, any
additional file poses two main problems:

* Disk space is cheap. Bandwidth not so much. Time to start a

* Security analysis. Unneeded files (especially binaries, but not

Another concern is the impact of image rebuilds as dependencies are
updated. Tianon (a primary maintainer of the Docker images) has noted
that they limit the rebuild frequency of the Debian base containers,
because every rebuild of the base container triggers an avalanche of
downstream rebuilds. CNPG was doing daily rebuilds for a while, and every
time any Python dependency was updated you'd get a new image - boto3 was
notorious for very frequent updates. So with a different image version
for every day, a single server running multiple copies of Postgres might
easily end up with multiple image versions on the server as copies are
slowly updated.

    I see this as a symptom of a different, bigger issue: package versions, including all transitive dependencies, should be version-pinned when building container images. I haven't seen many examples of people making the effort to do this, but it's the only way to re-run an image build and get reproducible output. Once you have this in place, you can decide how and when you upgrade which versions.
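To illustrate the version pinning described above, here is a minimal Python sketch that turns a pin list into an `apt-get install` command using the `name=version` syntax. The function name, the package names, and the version strings are hypothetical examples, not taken from any real image build; a real pin list would have to cover every transitive dependency.

```python
from typing import Mapping


def pinned_install_command(pins: Mapping[str, str]) -> list[str]:
    """Build an apt-get command that requests each package at an exact version."""
    if not pins:
        raise ValueError("no packages to pin")
    # apt-get accepts "name=version" to request one specific version.
    # Sorting keeps the generated command stable between runs.
    specs = [f"{name}={version}" for name, version in sorted(pins.items())]
    return ["apt-get", "install", "--yes", "--no-install-recommends", *specs]


# Hypothetical pins, for illustration only.
example_pins = {
    "postgresql-17": "17.2-1.pgdg120+1",
    "libpq5": "17.2-1.pgdg120+1",
}
print(" ".join(pinned_install_command(example_pins)))
```

Generating the command from a single reviewed pin list means an upgrade becomes an explicit change to that list rather than a side effect of rebuilding.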

I'm guessing most container builders are just not interested in doing that much work. It's easier to just "always upgrade", but as noted that comes with a whole different set of problems. It's only really feasible if you manage to first reduce the set of dependencies substantially.

 

    Actually, even version pinning is not enough, unless the package system guarantees that a version of a package is strictly immutable (and AFAIK this is usually not the case). So digest pinning is essentially required.
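As a rough sketch of what digest pinning means in practice: the build records the SHA-256 of each artifact it expects and refuses anything whose content hashes differently, regardless of the version string. The function name below is an assumption made for illustration; only the `hashlib` and `hmac` calls are standard library.

```python
import hashlib
import hmac


def matches_digest(data: bytes, expected_sha256_hex: str) -> bool:
    """Return True only if the SHA-256 of `data` equals the pinned digest."""
    actual = hashlib.sha256(data).hexdigest()
    # compare_digest performs a constant-time comparison of the two hex strings.
    return hmac.compare_digest(actual, expected_sha256_hex.lower())


# SHA-256 of the empty byte string, used here only as a well-known test value.
EMPTY_SHA256 = "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855"
print(matches_digest(b"", EMPTY_SHA256))
```

A package that is re-uploaded under the same version with different content would pass a version check but fail this one.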

Debian (since that is what this thread is about) is actually doing a very good job at that these days, though they're not there all the way. But https://tests.reproducible-builds.org/debian/reproducible.html shows they're doing really well.


Also on debian.net: https://amd64.reproduce.debian.net/#postgresql-17 for a "non-fancy" web page.


There was a talk on this very topic at a recent MiniDebConf (by kpcyrd):

 https://toulouse2024.mini.debconf.org/talks/4-reproducible-builds-rebuilding-what-is-distributed-from-ftpdebianorg/

"Since about a month we’ve also been rebuilding trying to exactly match the builds being distributed via ftp.d.o - this talk will describe the setup and the lessons learned so far, and why the results currently are what they are (spoiler: less <30% reproducible) and what we can do to fix that."

And rebuilderd is surely of interest to people willing to work on reproducible builds: https://github.com/kpcyrd/rebuilderd
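For readers unfamiliar with how a "percent reproducible" figure can be derived, here is a small Python sketch, assuming one digest per distributed package and one per rebuild. It is not how rebuilderd itself is implemented; the function name and the input shape are hypothetical.

```python
def reproducible_share(distributed: dict[str, str], rebuilt: dict[str, str]) -> float:
    """Fraction of distributed packages whose rebuild has an identical digest."""
    if not distributed:
        raise ValueError("no distributed packages to compare")
    # A package counts as reproducible only if a rebuild exists and its
    # digest is byte-for-byte equal to the distributed one.
    matching = sum(
        1 for name, digest in distributed.items() if rebuilt.get(name) == digest
    )
    return matching / len(distributed)
```

Packages that could not be rebuilt at all count as not reproducible here, which is one possible convention among several.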
 

---
Cédric Villemain +33 6 20 30 22 52
https://www.Data-Bene.io
PostgreSQL Support, Expertise, Training, R&D