public inbox for [email protected]  
From: Cédric Villemain <[email protected]>
To: Magnus Hagander <[email protected]>
To: Álvaro Hernández <[email protected]>
Cc: Jeremy Schneider <[email protected]>
Cc: Christoph Berg <[email protected]>
Cc: [email protected]
Subject: Re: deb package sizes
Date: Tue, 21 Jan 2025 10:26:00 +0100
Message-ID: <[email protected]> (raw)
In-Reply-To: <CABUevEzt5sF7tcn6=WVW8PTCfrf9qzmHicxN3mt37Th1kwfZHQ@mail.gmail.com>
References: <20250109005301.3b145092@jeremy-ThinkPad-T430s>
	<[email protected]>
	<[email protected]>
	<20250109090845.24e1df6e@jeremy-ThinkPad-T430s>
	<[email protected]>
	<CABUevEzt5sF7tcn6=WVW8PTCfrf9qzmHicxN3mt37Th1kwfZHQ@mail.gmail.com>


On 10/01/2025 10:52, Magnus Hagander wrote:
> On Thu, Jan 9, 2025 at 11:40 PM Álvaro Hernández <[email protected]> wrote:
>
>
>
>     On 9/1/25 18:08, Jeremy Schneider wrote:
>>     On Thu, 9 Jan 2025 17:06:57 +0100
>>     Álvaro Hernández <[email protected]> wrote:
>>
>>>     On 9/1/25 10:07, Christoph Berg wrote:
>>>>     Re: Jeremy Schneider
>>>>>     I'm wondering if there might be any support for providing a
>>>>>     "postgresql-slim" package on PGDG which excludes llvm and python? I
>>>>>     think this might almost cut the total install size in half, and I
>>>>>     think there might be many users who would value having the option.
>>>>>       
>>>>     Hi,
>>>>
>>>>     could you explain why 250 MB is too much? Disk space these days is
>>>>     ultra cheap
>>>           Hi Christoph.
>>>
>>>           Container images are meant to contain only the files needed
>>>     to run the process that starts when the image is run. As such, any
>>>     additional file poses two main problems:
>>>
>>>     * Disk space is cheap. Bandwidth not so much. Time to start a
>>>
>>>     * Security analysis. Unneeded files (especially binaries, but not
>>     Another concern is the impact of image rebuilds as dependencies are
>>     updated. Tianon (a primary maintainer of the docker images) has noted
>>     that they limit the rebuild frequency of the Debian base containers,
>>     because every rebuild of the base container triggers an avalanche of
>>     downstream rebuilds. CNPG was doing daily rebuilds for a while, and every time any
>>     python dependency was updated you'd get a new image - boto3 was
>>     notorious for very frequent updates. So with a different image version
>>     for every day, a single server running multiple copies of postgres might
>>     easily end up with multiple image versions on the server as copies are
>>     slowly updated.
>
>         I see this as a symptom of a different, bigger issue: that
>     package versions, and all transitive dependencies, should be
>     version pinned when building container images. I haven't seen too
>     many examples of taking the effort to do this. But it's the only
>     way to re-run image builds and guarantee reproducible outputs.
>     Once you have this in place, you can decide
>     how and when you upgrade which versions.
>
>
> I'm guessing most container builders are just not interested in doing 
> that much work. It's easier to just "always upgrade", but as noted 
> that comes with a whole different set of problems. It's only really 
> feasible if you manage to first reduce the set of dependencies 
> substantially.
>
>
>         Actually, even version pinning is not enough, unless the
>     package system guarantees that a version of a package is strictly
>     immutable (and AFAIK this is usually not the case). So digest
>     pinning is essentially required.
>
>
> Debian (as this was talking about it) is actually doing a very good 
> job at that these days, though they're not there all the way. But 
> https://tests.reproducible-builds.org/debian/reproducible.html shows 
> they're doing really well.


See also, on debian.net, a "non-fancy" web page:
https://amd64.reproduce.debian.net/#postgresql-17


There was a talk on this very topic, at minidebconf recently (by kpcyrd):

https://toulouse2024.mini.debconf.org/talks/4-reproducible-builds-rebuilding-what-is-distributed-fro...

"Since about a month we’ve also been rebuilding trying to exactly match 
the builds being distributed via ftp.d.o - this talk will describe the 
setup and the lessons learned so far, and why the results currently are 
what they are (spoiler: less <30% reproducible) and what we can do to 
fix that."

And rebuilderd is surely of interest for people willing to work on 
reproducible builds: https://github.com/kpcyrd/rebuilderd
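
To make the difference between version pinning and digest pinning (discussed in the quoted text above) a bit more concrete, here is a minimal sketch in Python. It is only an illustration, not how any packaging tool actually works: the function names, the package name "example-pkg" and the version "1.0-1" are all made up for the example. The idea is that a build step accepts a package only if both its (name, version) pair and the SHA-256 of its bytes match what was recorded earlier.

```python
import hashlib
import hmac


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of `data` as a lowercase hex string."""
    return hashlib.sha256(data).hexdigest()


def verify_pinned(name: str, version: str, data: bytes,
                  pins: dict[tuple[str, str], str]) -> bool:
    """Accept a package only if (name, version) is pinned AND its bytes
    hash to the pinned digest.

    `pins` is a hypothetical lock table mapping (name, version) to the
    expected SHA-256 hex digest. Version pinning alone would stop after
    the lookup; the digest comparison is what catches a package that was
    rebuilt or replaced under the same version string.
    """
    expected = pins.get((name, version))
    if expected is None:
        return False  # not pinned at all
    return hmac.compare_digest(sha256_hex(data), expected)


if __name__ == "__main__":
    # Hypothetical package bytes; a real build would read the .deb file.
    original = b"example package contents"
    rebuilt = b"example package contents, rebuilt"
    pins = {("example-pkg", "1.0-1"): sha256_hex(original)}

    print(verify_pinned("example-pkg", "1.0-1", original, pins))
    print(verify_pinned("example-pkg", "1.0-1", rebuilt, pins))
```

With this kind of check, a rebuilt package that keeps the same version string is rejected, which is the case that version pinning alone would miss.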

  

---
Cédric Villemain +33 6 20 30 22 52
https://www.Data-Bene.io
PostgreSQL Support, Expertise, Training, R&D

