public inbox for [email protected]
10+ messages / 5 participants
* deb package sizes
@ 2025-01-09 08:53 Jeremy Schneider <[email protected]>
2025-01-09 09:07 ` Re: deb package sizes Christoph Berg <[email protected]>
2025-01-09 15:43 ` Re: deb package sizes Álvaro Hernández <[email protected]>
0 siblings, 2 replies; 10+ messages in thread
From: Jeremy Schneider @ 2025-01-09 08:53 UTC
To: [email protected]
Hello, I hope I found a good mailing list for this topic?
Recently, I've been spending some time looking at the official Postgres
docker images. https://hub.docker.com/_/postgres/
I think there are a lot of people using these to quickly spin up a
postgres database for testing on their local dev machine. Right now,
they are also the base image used for building CloudNativePG production
postgres images.
These official docker images are a repackaging of the PGDG debian
packages, combined with a minimal set of debian OS packages. Docker
images are built using both debian stable and debian oldstable branches
(with tags like "17.1-bookworm" and "17.1-bullseye").
With docker images, we like to get the container images to be as
minimal and small as possible. I have spent a little time looking at the
make-up of the official docker images from a size perspective, which is
driven by debian package sizes.
Before adding any PGDG postgres packages or dependencies, our base OS
container image is 74MB and includes about 88 debian packages.
We install only 5 PGDG postgres packages: postgresql,
postgresql-client, postgresql-client-common, postgresql-common and
libpq5. The "common" packages are tiny, libpq is 1MB, client is 10MB
and the postgresql package itself is 60MB.
What's more interesting is all of the additional dependencies that the
postgresql package pulls in: an extra 53 debian packages that are over
250MB in total size.
The biggest size contributors are libllvm & libz3 (143MB), libperl &
perl-modules (45MB total) and libicu (36MB). These three things alone
make up 64% of the total postgres-specific bytes.
I'm wondering if there might be any support for providing a
"postgresql-slim" package on PGDG which excludes llvm and python? I
think this might almost cut the total install size in half, and I think
there might be many users who would value having the option.
Even though ICU is a larger package, I would argue for still
including it in a "slim" build. Because of the drama around glibc
collation I view ICU as especially important to make available.
Interested to know others' thoughts about having a slimmer package.
Thanks,
Jeremy Schneider
PS. here are the commands I used to get the sizes (apologies that the
formatting isn't great) and the full list of postgresql-specific
packages
docker run --rm debian:bookworm-slim dpkg-query --show \
  --showformat='${Package}\t${Installed-Size} KB\n' > base-pkgs

docker run --rm postgres:17-bookworm dpkg-query --show \
  --showformat='${Package}\t${Installed-Size} KB\n' > pg-pkgs

docker run --rm postgres:17-bookworm apt rdepends libz3-4
libz3-4
Reverse Depends:
  Depends: libllvm16 (>= 4.8.12)

# 355572 is the grand total in KB (the final running total below)
diff -b base-pkgs pg-pkgs | grep '^>' | sort -k3 -n |
  awk '{ total += $3
         printf "%-30s | running total size: %d KB | running total percentage: %g%%\n",
                $0, total, total*100/355572 }'
netbase                      36 KB | running total size:     36 KB | running total percentage: 0.0101245%
libkeyutils1                 40 KB | running total size:     76 KB | running total percentage: 0.021374%
libnpth0                     50 KB | running total size:    126 KB | running total percentage: 0.0354359%
sensible-utils               56 KB | running total size:    182 KB | running total percentage: 0.0511851%
ssl-cert                     64 KB | running total size:    246 KB | running total percentage: 0.0691843%
libgdbm-compat4              70 KB | running total size:    316 KB | running total percentage: 0.0888709%
libsasl2-modules-db          77 KB | running total size:    393 KB | running total percentage: 0.110526%
readline-common              89 KB | running total size:    482 KB | running total percentage: 0.135556%
libnss-wrapper               99 KB | running total size:    581 KB | running total percentage: 0.163399%
libio-pty-perl              103 KB | running total size:    684 KB | running total percentage: 0.192366%
libassuan0                  117 KB | running total size:    801 KB | running total percentage: 0.225271%
libgdbm6                    129 KB | running total size:    930 KB | running total percentage: 0.26155%
libkrb5support0             133 KB | running total size:   1063 KB | running total percentage: 0.298955%
postgresql-client-common    133 KB | running total size:   1196 KB | running total percentage: 0.336359%
pinentry-curses             140 KB | running total size:   1336 KB | running total percentage: 0.375733%
libsasl2-2                  167 KB | running total size:   1503 KB | running total percentage: 0.422699%
libbsd0                     202 KB | running total size:   1705 KB | running total percentage: 0.479509%
ucf                         214 KB | running total size:   1919 KB | running total percentage: 0.539694%
libjson-perl                244 KB | running total size:   2163 KB | running total percentage: 0.608316%
libedit2                    258 KB | running total size:   2421 KB | running total percentage: 0.680875%
libk5crypto3                260 KB | running total size:   2681 KB | running total percentage: 0.753996%
libipc-run-perl             267 KB | running total size:   2948 KB | running total percentage: 0.829087%
less                        313 KB | running total size:   3261 KB | running total percentage: 0.917114%
libksba8                    316 KB | running total size:   3577 KB | running total percentage: 1.00598%
libncursesw6                412 KB | running total size:   3989 KB | running total percentage: 1.12185%
libgssapi-krb5-2            424 KB | running total size:   4413 KB | running total percentage: 1.2411%
libreadline8                475 KB | running total size:   4888 KB | running total percentage: 1.37469%
libxslt1.1                  504 KB | running total size:   5392 KB | running total percentage: 1.51643%
libldap-2.5-0               553 KB | running total size:   5945 KB | running total percentage: 1.67195%
gpg-wks-server              657 KB | running total size:   6602 KB | running total percentage: 1.85673%
postgresql-common           667 KB | running total size:   7269 KB | running total percentage: 2.04431%
perl                        669 KB | running total size:   7938 KB | running total percentage: 2.23246%
gpg-wks-client              682 KB | running total size:   8620 KB | running total percentage: 2.42426%
gpgconf                     803 KB | running total size:   9423 KB | running total percentage: 2.6501%
gnupg                       885 KB | running total size:  10308 KB | running total percentage: 2.89899%
gpgsm                       992 KB | running total size:  11300 KB | running total percentage: 3.17798%
libpq5                     1068 KB | running total size:  12368 KB | running total percentage: 3.47834%
libkrb5-3                  1076 KB | running total size:  13444 KB | running total percentage: 3.78095%
xz-utils                   1226 KB | running total size:  14670 KB | running total percentage: 4.12575%
dirmngr                    1328 KB | running total size:  15998 KB | running total percentage: 4.49923%
gpg-agent                  1348 KB | running total size:  17346 KB | running total percentage: 4.87834%
gpg                        1581 KB | running total size:  18927 KB | running total percentage: 5.32297%
libsqlite3-0               1682 KB | running total size:  20609 KB | running total percentage: 5.79601%
gnupg-utils                1836 KB | running total size:  22445 KB | running total percentage: 6.31236%
libxml2                    1866 KB | running total size:  24311 KB | running total percentage: 6.83715%
zstd                       2102 KB | running total size:  26413 KB | running total percentage: 7.42831%
openssl                    2296 KB | running total size:  28709 KB | running total percentage: 8.07403%
libc-l10n                  4348 KB | running total size:  33057 KB | running total percentage: 9.29685%
gnupg-l10n                 4874 KB | running total size:  37931 KB | running total percentage: 10.6676%
libssl3                    6021 KB | running total size:  43952 KB | running total percentage: 12.3609%
postgresql-client-17       9947 KB | running total size:  53899 KB | running total percentage: 15.1584%
locales                   15845 KB | running total size:  69744 KB | running total percentage: 19.6146%
perl-modules-5.36         17816 KB | running total size:  87560 KB | running total percentage: 24.6251%
libz3-4                   22767 KB | running total size: 110327 KB | running total percentage: 31.028%
libperl5.36               28862 KB | running total size: 139189 KB | running total percentage: 39.1451%
libicu72                  36170 KB | running total size: 175359 KB | running total percentage: 49.3174%
postgresql-17             59671 KB | running total size: 235030 KB | running total percentage: 66.0991%
libllvm16                120542 KB | running total size: 355572 KB | running total percentage: 100%
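
The 64% figure quoted in the message can be re-derived from the installed sizes in the listing. The following is a minimal Python sketch and not part of the original measurement: the sizes are copied from the output above, while the grouping into `llvm`, `perl` and `icu` and the helper name `group_shares` are illustrative assumptions.

```python
# Installed sizes in KB, copied from the dpkg-query listing above.
SIZES_KB = {
    "libllvm16": 120542,
    "libz3-4": 22767,
    "libperl5.36": 28862,
    "perl-modules-5.36": 17816,
    "libicu72": 36170,
}
TOTAL_KB = 355572  # final running total from the listing

# Hypothetical grouping, chosen to mirror the three contributors named in the mail.
GROUPS = {
    "llvm": ["libllvm16", "libz3-4"],
    "perl": ["libperl5.36", "perl-modules-5.36"],
    "icu": ["libicu72"],
}


def group_shares(sizes, groups, total):
    """Return {group: (size_kb, percent_of_total)} for each group."""
    shares = {}
    for name, pkgs in groups.items():
        size = sum(sizes[p] for p in pkgs)
        shares[name] = (size, 100.0 * size / total)
    return shares


if __name__ == "__main__":
    shares = group_shares(SIZES_KB, GROUPS, TOTAL_KB)
    for name, (size, pct) in shares.items():
        print(f"{name:5s} {size:7d} KB {pct:5.1f}%")
    combined = sum(size for size, _ in shares.values())
    print(f"combined {combined} KB = {100.0 * combined / TOTAL_KB:.1f}% of {TOTAL_KB} KB")
```

Under this grouping the three contributors add up to 226157 KB, about 63.6% of the 355572 KB total, which is consistent with the rounded 64% in the message.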
* Re: deb package sizes
2025-01-09 08:53 deb package sizes Jeremy Schneider <[email protected]>
@ 2025-01-09 09:07 ` Christoph Berg <[email protected]>
2025-01-09 16:06 ` Re: deb package sizes Álvaro Hernández <[email protected]>
1 sibling, 1 reply; 10+ messages in thread
From: Christoph Berg @ 2025-01-09 09:07 UTC
To: [email protected]
Re: Jeremy Schneider
> I'm wondering if there might be any support for providing a
> "postgresql-slim" package on PGDG which excludes llvm and python? I
> think this might almost cut the total install size in half, and I think
> there might be many users who would value having the option.
Hi,
could you explain why 250 MB is too much? Disk space these days is
ultra cheap and removing functionality (query JITing) does have cost
as well.
> Even though ICU is a larger package, I would argue for still
> including it in a "slim" build. Because of the drama around glibc
> collation I view ICU as especially important to make available.
Note that ICU does not fix the collation drama either, you will have
to reindex on ICU upgrades as well.
Christoph
* Re: deb package sizes
2025-01-09 08:53 deb package sizes Jeremy Schneider <[email protected]>
2025-01-09 09:07 ` Re: deb package sizes Christoph Berg <[email protected]>
@ 2025-01-09 16:06 ` Álvaro Hernández <[email protected]>
2025-01-09 17:08 ` Re: deb package sizes Jeremy Schneider <[email protected]>
0 siblings, 1 reply; 10+ messages in thread
From: Álvaro Hernández @ 2025-01-09 16:06 UTC
To: Christoph Berg <[email protected]>; [email protected]
On 9/1/25 10:07, Christoph Berg wrote:
> Re: Jeremy Schneider
>> I'm wondering if there might be any support for providing a
>> "postgresql-slim" package on PGDG which excludes llvm and python? I
>> think this might almost cut the total install size in half, and I think
>> there might be many users who would value having the option.
> Hi,
>
> could you explain why 250 MB is too much? Disk space these days is
> ultra cheap
Hi Christoph.
Container images are meant to contain only the files needed to run
the process that the image exists to run. As such, any additional file
poses two main problems:
* Disk space is cheap. Bandwidth not so much. Time to start a container
may have a notable cost. Making container images slimmer helps in all
these dimensions. When you run the same container image in many places,
with high frequency, and end up pulling it multiple times, all of that
has a cost. In particular for Postgres, the time spent pulling and
starting an image may affect uptime. So it can become quite important.
* Security analysis. Unneeded files (especially binaries, but not only)
may lead to container images having more security vulnerabilities than
necessary. For many, container images must pass vulnerability analysis
scans, and the more unneeded packages are present, the bigger the chances
that they contain vulnerabilities. It's a basic security principle anyway
to include only the files needed to run the process, and no more.
> and removing functionality (query JITing) does have cost
> as well.
If it can be made optional, then users can decide whether they want
container images with this functionality or not.
>> Even though ICU is a larger package, I would argue for still
>> including it in a "slim" build. Because of the drama around glibc
>> collation I view ICU as especially important to make available.
> Note that ICU does not fix the collation drama either, you will have
> to reindex on ICU upgrades as well.
Agreed that it doesn't solve the whole drama, but reindexes are not
needed if container images for upgrades are provided while keeping the
ICU version constant (which is doable).
Álvaro
* Re: deb package sizes
2025-01-09 08:53 deb package sizes Jeremy Schneider <[email protected]>
2025-01-09 09:07 ` Re: deb package sizes Christoph Berg <[email protected]>
2025-01-09 16:06 ` Re: deb package sizes Álvaro Hernández <[email protected]>
@ 2025-01-09 17:08 ` Jeremy Schneider <[email protected]>
2025-01-09 22:40 ` Re: deb package sizes Álvaro Hernández <[email protected]>
0 siblings, 1 reply; 10+ messages in thread
From: Jeremy Schneider @ 2025-01-09 17:08 UTC
To: Álvaro Hernández <[email protected]>; +Cc: Christoph Berg <[email protected]>; [email protected]
On Thu, 9 Jan 2025 17:06:57 +0100
Álvaro Hernández <[email protected]> wrote:
> On 9/1/25 10:07, Christoph Berg wrote:
> > Re: Jeremy Schneider
> >> I'm wondering if there might be any support for providing a
> >> "postgresql-slim" package on PGDG which excludes llvm and python? I
> >> think this might almost cut the total install size in half, and I
> >> think there might be many users who would value having the option.
> >>
> > Hi,
> >
> > could you explain why 250 MB is too much? Disk space these days is
> > ultra cheap
>
> Hi Christoph.
>
> Container images allow (are meant to) contain only the necessary
> files needed to run the process that will be run when the image is
> run. As such, any additional file poses two main problems:
>
> * Disk space is cheap. Bandwidth not so much. Time to start a
>
> * Security analysis. Unneeded files (specially binaries, but not
Another concern is the impact of image rebuilds as dependencies are
updated. Tianon (a primary maintainer of the docker images) has noted
that they limit how often the debian base containers are rebuilt, because
every rebuild of the base container triggers an avalanche of downstream
rebuilds. CNPG was doing daily rebuilds for a while, and every time any
python dependency was updated you'd get a new image - boto3 was
notorious for very frequent updates. So with a different image version
for every day, a single server running multiple copies of postgres might
easily end up with multiple image versions on the server as copies are
slowly updated.
>
> > and removing functionality (query JITing) does have cost
> > as well.
>
> If it can be made optional, then users can decide whether they
> want container images with this functionality or not.
To be clear, I definitely don't want the "default" postgres packages to
not have JIT. I was just suggesting a non-default "slim" alternative.
Honestly I don't know if this is going to introduce a bunch of
complexity in dependency management between debian packages, and how
feasible it would be to actually do it... but I wanted to ask the question
and raise the topic.
> >> Even though ICU is a larger package, I would argue for still
> >> including it in a "slim" build. Because of the drama around glibc
> >> collation I view ICU as especially important to make available.
> > Note that ICU does not fix the collation drama either, you will have
> > to reindex on ICU upgrades as well.
>
> Agreed that it doesn't solve the whole drama, but reindexes are
> not needed if container images for upgrades are provided while
> keeping the ICU version constant (which is doable).
Yes, I'm definitely well aware that ICU doesn't really change
anything about the rebuild requirement - I've said many times that people
should default to builtin C collation starting with pg17, and set
linguistic collation at a table or query level. The big advantage of
this is that it's much easier to know everything that needs rebuilding,
since postgres does good dependency tracking of objects using nondefault
collation.
But with ICU there is at least the option that someone could rebuild an
old version and run it on the new debian release. That's nearly
impossible with glibc.
-Jeremy
* Re: deb package sizes
2025-01-09 08:53 deb package sizes Jeremy Schneider <[email protected]>
2025-01-09 09:07 ` Re: deb package sizes Christoph Berg <[email protected]>
2025-01-09 16:06 ` Re: deb package sizes Álvaro Hernández <[email protected]>
2025-01-09 17:08 ` Re: deb package sizes Jeremy Schneider <[email protected]>
@ 2025-01-09 22:40 ` Álvaro Hernández <[email protected]>
2025-01-10 09:52 ` Re: deb package sizes Magnus Hagander <[email protected]>
0 siblings, 1 reply; 10+ messages in thread
From: Álvaro Hernández @ 2025-01-09 22:40 UTC
To: Jeremy Schneider <[email protected]>; +Cc: Christoph Berg <[email protected]>; [email protected]
On 9/1/25 18:08, Jeremy Schneider wrote:
> Another concern is the impact of image rebuilds as dependencies are
> updated. Tianon (a primary maintainer of the docker images) has noted
> that they limit how often the debian base containers are rebuilt, because
> every rebuild of the base container triggers an avalanche of downstream
> rebuilds. CNPG was doing daily rebuilds for a while, and every time any
> python dependency was updated you'd get a new image - boto3 was
> notorious for very frequent updates. So with a different image version
> for every day, a single server running multiple copies of postgres might
> easily end up with multiple image versions on the server as copies are
> slowly updated.
I see this as a symptom of a different, bigger issue: package
versions, and all transitive dependencies, should be version pinned when
building container images. I haven't seen too many examples of taking
the effort to do this. But it's the only way to re-run image builds and
guarantee reproducible outputs. Once you have this in place, you can
decide how and when you upgrade which versions.
Actually, even version pinning is not enough, unless the package
system guarantees that a version of a package is strictly immutable (and
AFAIK this is usually not the case). So digest pinning is essentially
required.
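
Digest pinning, as described above, amounts to recording a cryptographic hash of each artifact and refusing to use anything that does not match it. A minimal Python sketch follows, assuming SHA-256 as the digest; the helper names `sha256_of` and `verify_digest` are hypothetical, the file contents in the demo are made up, and nothing here reflects how any particular image build actually works.

```python
import hashlib
import hmac
import os
import tempfile


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def verify_digest(path, pinned_hex):
    """Return True only if the file's digest equals the pinned value."""
    return hmac.compare_digest(sha256_of(path), pinned_hex.lower())


if __name__ == "__main__":
    # Stand-in artifact with made-up contents, for illustration only.
    with tempfile.NamedTemporaryFile(delete=False) as f:
        f.write(b"example package contents")
        path = f.name
    try:
        pinned = sha256_of(path)            # recorded once, at pin time
        print(verify_digest(path, pinned))  # file unchanged since pinning
        with open(path, "ab") as f:
            f.write(b"!")                   # same name, different bytes
        print(verify_digest(path, pinned))  # content no longer matches the pin
    finally:
        os.remove(path)
```

The point of the design is that the check depends only on the bytes, so it holds regardless of whether the package system treats a version string as immutable.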
> But with ICU there is at least the option that someone could rebuild an
> old version and run it on the new debian release. That's nearly
> impossible with glibc.
>
Exactly, and this is doable.
Álvaro
--
Alvaro Hernandez
-----------
OnGres
* Re: deb package sizes
2025-01-09 08:53 deb package sizes Jeremy Schneider <[email protected]>
2025-01-09 09:07 ` Re: deb package sizes Christoph Berg <[email protected]>
2025-01-09 16:06 ` Re: deb package sizes Álvaro Hernández <[email protected]>
2025-01-09 17:08 ` Re: deb package sizes Jeremy Schneider <[email protected]>
2025-01-09 22:40 ` Re: deb package sizes Álvaro Hernández <[email protected]>
@ 2025-01-10 09:52 ` Magnus Hagander <[email protected]>
2025-01-10 11:32 ` Re: deb package sizes Álvaro Hernández <[email protected]>
2025-01-21 09:26 ` Re: deb package sizes Cédric Villemain <[email protected]>
0 siblings, 2 replies; 10+ messages in thread
From: Magnus Hagander @ 2025-01-10 09:52 UTC
To: Álvaro Hernández <[email protected]>; +Cc: Jeremy Schneider <[email protected]>; Christoph Berg <[email protected]>; [email protected]
On Thu, Jan 9, 2025 at 11:40 PM Álvaro Hernández <[email protected]> wrote:
> I see this as a symptom of a different, bigger issue: that package
> versions, and all transitive dependencies, should be version pinned when
> building container images. I haven't seen too many examples of taking the
> effort to do this. But it's the only way to have a way to re-run building
> images and guarantee outputs that are reproducible. Once you have this in
> place, you can decide how and when you upgrade which versions.
>
I'm guessing most container builders are just not interested in doing that
much work. It's easier to just "always upgrade", but as noted that comes
with a whole different set of problems. It's only really feasible if you
manage to first reduce the set of dependencies substantially.
>
> Actually, even version pinning is not enough, unless the package
> system guarantees that a version of a package is strictly immutable (and
> AFAIK this is usually not the case). So digest pinning is essentially
> required.
>
Debian (as that is what this was about) is actually doing a very good job of
that these days, though they're not all the way there. But
https://tests.reproducible-builds.org/debian/reproducible.html shows they're
doing really well.
--
Magnus Hagander
Me: https://www.hagander.net/
Work: https://www.redpill-linpro.com/
* Re: deb package sizes
2025-01-09 08:53 deb package sizes Jeremy Schneider <[email protected]>
2025-01-09 09:07 ` Re: deb package sizes Christoph Berg <[email protected]>
2025-01-09 16:06 ` Re: deb package sizes Álvaro Hernández <[email protected]>
2025-01-09 17:08 ` Re: deb package sizes Jeremy Schneider <[email protected]>
2025-01-09 22:40 ` Re: deb package sizes Álvaro Hernández <[email protected]>
2025-01-10 09:52 ` Re: deb package sizes Magnus Hagander <[email protected]>
@ 2025-01-10 11:32 ` Álvaro Hernández <[email protected]>
2025-01-10 12:17 ` Re: deb package sizes Christoph Berg <[email protected]>
1 sibling, 1 reply; 10+ messages in thread
From: Álvaro Hernández @ 2025-01-10 11:32 UTC
To: Magnus Hagander <[email protected]>; +Cc: Jeremy Schneider <[email protected]>; Christoph Berg <[email protected]>; [email protected]
On 10/1/25 10:52, Magnus Hagander wrote:
> On Thu, Jan 9, 2025 at 11:40 PM Álvaro Hernández <[email protected]> wrote:
> I see this as a symptom of a different, bigger issue: that
> package versions, and all transitive dependencies, should be
> version pinned when building container images. I haven't seen too
> many examples of taking the effort to do this. But it's the only
> way to have a way to re-run building images and guarantee outputs
> that are reproducible. Once you have this in place, you can decide
> how and when you upgrade which versions.
>
>
> I'm guessing most container builders are just not interested in doing
> that much work. It's easier to just "always upgrade", but as noted
> that comes with a whole different set of problems. It's only really
> feasible if you manage to first reduce the set of dependencies
> substantially.
Yes, it comes with a whole set of problems. The main one, other
than upgrades, is that you may end up with inconsistent environments:
cases where not all images deployed are the same because some
dependencies have different versions. This may also lead to different
CVEs present on different servers. This is far from ideal and a problem
that is starting to be more and more visible.
While container builders may not be interested in doing all this
work, I think that it should be done regardless. And over time, it will
be done more and more. When security and supply-chain attacks are a
serious concern, precise knowledge of your dependencies is key.
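
The inconsistency described above can be detected by comparing the package listings of two images. The sketch below is an illustrative assumption and not a tool mentioned in this thread: it expects tab-separated `package<TAB>version` lines, such as `dpkg-query --show` prints by default, and the function names `parse_listing` and `version_drift` as well as the sample versions are hypothetical.

```python
def parse_listing(text):
    """Parse 'package<TAB>version' lines into a {package: version} dict."""
    pkgs = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        name, _, version = line.partition("\t")
        pkgs[name.strip()] = version.strip()
    return pkgs


def version_drift(a, b):
    """Return {package: (version_in_a, version_in_b)} where the two differ.

    A package missing from one side is reported with None on that side.
    """
    drift = {}
    for name in sorted(set(a) | set(b)):
        va, vb = a.get(name), b.get(name)
        if va != vb:
            drift[name] = (va, vb)
    return drift


if __name__ == "__main__":
    # Made-up listings for illustration only.
    image_a = "libssl3\t3.0.15-1\nlibicu72\t72.1-3\n"
    image_b = "libssl3\t3.0.16-1\nlibicu72\t72.1-3\nzstd\t1.5.4\n"
    drift = version_drift(parse_listing(image_a), parse_listing(image_b))
    for name, (va, vb) in drift.items():
        print(f"{name}: {va} -> {vb}")
```

An empty result means both images carry the same package set at the same versions; it says nothing about whether the bytes behind those versions are identical.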
>
>
> Actually, even version pinning is not enough, unless the
> package system guarantees that a version of a package is strictly
> immutable (and AFAIK this is usually not the case). So digest
> pinning is essentially required.
>
>
> Debian (as this was talking about it) is actually doing a very good
> job of that these days, though they're not there all the way. But
> https://tests.reproducible-builds.org/debian/reproducible.html shows
> they're doing really well.
Debian is doing a great job on the reproducibility of its package
builds. However, AFAIK a given package version can be updated with
different content, and that's why a service like
https://snapshot.debian.org exists.
Álvaro
--
Alvaro Hernandez
-----------
OnGres
* Re: deb package sizes
2025-01-09 08:53 deb package sizes Jeremy Schneider <[email protected]>
2025-01-09 09:07 ` Re: deb package sizes Christoph Berg <[email protected]>
2025-01-09 16:06 ` Re: deb package sizes Álvaro Hernández <[email protected]>
2025-01-09 17:08 ` Re: deb package sizes Jeremy Schneider <[email protected]>
2025-01-09 22:40 ` Re: deb package sizes Álvaro Hernández <[email protected]>
2025-01-10 09:52 ` Re: deb package sizes Magnus Hagander <[email protected]>
2025-01-10 11:32 ` Re: deb package sizes Álvaro Hernández <[email protected]>
@ 2025-01-10 12:17 ` Christoph Berg <[email protected]>
0 siblings, 0 replies; 10+ messages in thread
From: Christoph Berg @ 2025-01-10 12:17 UTC
To: Álvaro Hernández <[email protected]>; +Cc: Magnus Hagander <[email protected]>; Jeremy Schneider <[email protected]>; [email protected]
Re: Álvaro Hernández
> Debian is doing a great job towards reproducibility of the build efforts
> of their packages. However, AFAIK a given package version can be updated
> with a different content --and that's why a service like
> https://snapshot.debian.org exists.
That will never happen, new packages always have new version/revision numbers.
Same on apt.postgresql.org.
Christoph
* Re: deb package sizes
2025-01-09 08:53 deb package sizes Jeremy Schneider <[email protected]>
2025-01-09 09:07 ` Re: deb package sizes Christoph Berg <[email protected]>
2025-01-09 16:06 ` Re: deb package sizes Álvaro Hernández <[email protected]>
2025-01-09 17:08 ` Re: deb package sizes Jeremy Schneider <[email protected]>
2025-01-09 22:40 ` Re: deb package sizes Álvaro Hernández <[email protected]>
2025-01-10 09:52 ` Re: deb package sizes Magnus Hagander <[email protected]>
@ 2025-01-21 09:26 ` Cédric Villemain <[email protected]>
1 sibling, 0 replies; 10+ messages in thread
From: Cédric Villemain @ 2025-01-21 09:26 UTC
To: Magnus Hagander <[email protected]>; Álvaro Hernández <[email protected]>; +Cc: Jeremy Schneider <[email protected]>; Christoph Berg <[email protected]>; [email protected]
On 10/01/2025 10:52, Magnus Hagander wrote:
> Debian (as this was talking about it) is actually doing a very good
> job at that these days, though they're not there all the way. But
> https://tests.reproducible-builds.org/debian/reproducible.html shows
> they're doing really well.
Also on debian.net: https://amd64.reproduce.debian.net/#postgresql-17
for a "non-fancy" web page.
There was a talk on this very topic, at minidebconf recently (by kpcyrd):
https://toulouse2024.mini.debconf.org/talks/4-reproducible-builds-rebuilding-what-is-distributed-fro...
"Since about a month we’ve also been rebuilding trying to exactly match
the builds being distributed via ftp.d.o - this talk will describe the
setup and the lessons learned so far, and why the results currently are
what they are (spoiler: less <30% reproducible) and what we can do to
fix that."
And rebuilderd is surely of interest for people willing to work on
reproducible builds: https://github.com/kpcyrd/rebuilderd
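As a minimal sketch of the check that underlies both digest pinning and rebuild verification: two artifacts count as the same only if their cryptographic digests match. The helper name same_digest and the file names are made up for illustration, the files are plain-text stand-ins rather than real .deb packages, and this is not how rebuilderd itself is implemented.

```shell
# Hypothetical helper: succeed only if both files exist and have the
# same SHA-256 digest (i.e. they are bit-for-bit identical).
same_digest() {
    digest_a=$(sha256sum "$1" | cut -d ' ' -f 1)
    digest_b=$(sha256sum "$2" | cut -d ' ' -f 1)
    # An unreadable file yields an empty digest, which never counts as a match.
    [ -n "$digest_a" ] && [ "$digest_a" = "$digest_b" ]
}

# Demo with stand-in files (assumption: not real packages).
tmp=$(mktemp -d)
printf 'same bytes\n' > "$tmp/distributed.deb"
printf 'same bytes\n' > "$tmp/rebuilt.deb"
if same_digest "$tmp/distributed.deb" "$tmp/rebuilt.deb"; then
    echo "reproducible"
else
    echo "differs"
fi
rm -rf "$tmp"
```

A version string can be reused for different bytes, which is why the comparison is done on digests rather than on package versions.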
---
Cédric Villemain +33 6 20 30 22 52
https://www.Data-Bene.io
PostgreSQL Support, Expertise, Training, R&D
* Re: deb package sizes
2025-01-09 08:53 deb package sizes Jeremy Schneider <[email protected]>
@ 2025-01-09 15:43 ` Álvaro Hernández <[email protected]>
1 sibling, 0 replies; 10+ messages in thread
From: Álvaro Hernández @ 2025-01-09 15:43 UTC (permalink / raw)
To: Jeremy Schneider <[email protected]>; [email protected]
On 9/1/25 9:53, Jeremy Schneider wrote:
> Hello, I hope I found a good mailing list for this topic?
>
> Recently, I've been spending some time looking at the official Postgres
> docker images. https://hub.docker.com/_/postgres/
Hi Jeremy.
Nitpicking a bit, but I'd call these the "Official Docker Postgres
images". They are official from Docker's perspective. I say this for
general awareness; not everybody understands it's like this.
> With docker images, we like to get the container images to be as
> minimal and small as possible.
Agreed.
> I have spent a little time looking at the
> make-up of the official docker images from a size perspective, which is
> driven by debian package sizes.
In my opinion, "system packages" (deb, rpm, etc) are not
necessarily the best way to compose container images. They are designed
for "systems", and usually contain many files that may not be needed on
a container.
> Before adding any PGDG postgres packages or dependencies, our base OS
> container image is 74MB and includes about 88 debian packages.
Something to consider here is using Distroless
(https://github.com/GoogleContainerTools/distroless), which is a bit of a
misnomer as it really is based on Debian too.
> We install only 5 PGDG postgres packages: postgresql,
> postgresql-client, postgresql-client-common, postgresql-common and
> libpq5. The "common" packages are tiny, libpq is 1MB, client is 10MB
> and the postgresql package itself is 60MB.
>
> What's more interesting is all of the additional dependencies that the
> postgresql package pulls in: an extra 53 debian packages that are over
> 250MB in total size.
>
> The biggest size contributors are libllvm & libz3 (143MB), libperl &
> perl-modules (45MB total) and libicu (36MB). These three things alone
> make up 64% of the total postgres-specific bytes.
While the results are not too different from your analysis, I'd do
it from the layers that compose the image itself. Here's a simple way to
do it:
$ docker history --no-trunc --format '{{ .Size }} {{ .CreatedBy }}' \
    postgres | egrep '^[0-9]+(\.[0-9]+)?MB' | cut -b 1-72
330MB RUN /bin/sh -c set -ex; export PYTHONDONTWRITEBYTECODE=1; dpkg
3.61MB RUN /bin/sh -c set -eux; apt-get update; apt-get install -y --n
26.9MB RUN /bin/sh -c set -eux; if [ -f /etc/dpkg/dpkg.cfg.d/docker ];
4.27MB RUN /bin/sh -c set -eux; savedAptMark="$(apt-mark showmanual)";
10.8MB RUN /bin/sh -c set -ex; apt-get update; apt-get install -y --no
85.2MB # debian.sh --arch 'amd64' out/ 'bookworm' '@1734912000'
(see the attached non-truncated version for completeness)
"Base" image is 85MB, Postgres plus dependencies is 330MB (which
you distilled in more detail) and then there's some other 27MB in
locales and 11MB in additional tools.
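As a rough sketch, the per-layer figures can be totaled with a small awk filter. The function name sum_layer_mb is made up for illustration, the sample input repeats the six sizes listed in this message with the commands shortened to labels, and the filter only counts sizes reported in MB.

```shell
# Hypothetical helper: sum the leading "<number>MB" field of each line on stdin.
# Lines whose first field is not in MB (e.g. "0B") are ignored.
sum_layer_mb() {
    awk '$1 ~ /^[0-9]+(\.[0-9]+)?MB$/ { sub(/MB$/, "", $1); total += $1 }
         END { printf "%.2f\n", total }'
}

# Sample input: the layer sizes reported above (commands replaced by labels).
sum_layer_mb <<'EOF'
330MB postgres and dependencies
3.61MB libnss-wrapper xz-utils zstd
26.9MB locales
4.27MB gosu
10.8MB gnupg less
85.2MB debian base image
EOF
```

With Docker available, the same filter could be fed directly from the docker history --format '{{ .Size }} {{ .CreatedBy }}' postgres output shown earlier.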
Also to note is that Docker's official Postgres image compiles from
source packages, not just installs from PGDG (e.g. see
https://github.com/docker-library/postgres/blob/cb049360d9a316e429740d47431e0d6fa129d11a/17/bookworm...).
> I'm wondering if there might be any support for providing a
> "postgresql-slim" package on PGDG which excludes llvm and python? I
> think this might almost cut the total install size in half, and I think
> there might be many users who would value having the option.
>
> Even though ICU is a larger package, I would argue for still
> including it in a "slim" build. Because of the drama around glibc
> collation I view ICU as especially important to make available.
>
> Interested to know others' thoughts about having a slimmer package.
+1
I believe there should be a place for slimmer, or even better,
user-configurable Postgres images. Different use cases need different
containers. The Postgres-on-testcontainers use case needs little to no
additional features, while a production setup may require different
additional tools. Similarly, different environments (ICU / not ICU, sets
of locales, parallel query or not) may require different images. Having
choice here would be of great benefit.
Álvaro
--
Alvaro Hernandez
-----------
OnGres
Attachments:
[text/plain] postgres-image.layers.size.txt (3.7K, 2-postgres-image.layers.size.txt)
download | inline:
330MB RUN /bin/sh -c set -ex; export PYTHONDONTWRITEBYTECODE=1; dpkgArch="$(dpkg --print-architecture)"; aptRepo="[ signed-by=/usr/local/share/keyrings/postgres.gpg.asc ] http://apt.postgresql.org/pub/repos/apt/ bookworm-pgdg main $PG_MAJOR"; case "$dpkgArch" in amd64 | arm64 | ppc64el | s390x) echo "deb $aptRepo" > /etc/apt/sources.list.d/pgdg.list; apt-get update; ;; *) echo "deb-src $aptRepo" > /etc/apt/sources.list.d/pgdg.list; savedAptMark="$(apt-mark showmanual)"; tempDir="$(mktemp -d)"; cd "$tempDir"; apt-get update; apt-get install -y --no-install-recommends dpkg-dev; echo "deb [ trusted=yes ] file://$tempDir ./" > /etc/apt/sources.list.d/temp.list; _update_repo() { dpkg-scanpackages . > Packages; apt-get -o Acquire::GzipIndexes=false update; }; _update_repo; nproc="$(nproc)"; export DEB_BUILD_OPTIONS="nocheck parallel=$nproc"; apt-get build-dep -y postgresql-common pgdg-keyring; apt-get source --compile postgresql-common pgdg-keyring; _update_repo; apt-get build-dep -y "postgresql-$PG_MAJOR=$PG_VERSION"; apt-get source --compile "postgresql-$PG_MAJOR=$PG_VERSION"; apt-mark showmanual | xargs apt-mark auto > /dev/null; apt-mark manual $savedAptMark; ls -lAFh; _update_repo; grep '^Package: ' Packages; cd /; ;; esac; apt-get install -y --no-install-recommends postgresql-common; sed -ri 's/#(create_main_cluster) .*$/\1 = false/' /etc/postgresql-common/createcluster.conf; apt-get install -y --no-install-recommends "postgresql-$PG_MAJOR=$PG_VERSION" ; rm -rf /var/lib/apt/lists/*; if [ -n "$tempDir" ]; then apt-get purge -y --auto-remove; rm -rf "$tempDir" /etc/apt/sources.list.d/temp.list; fi; find /usr -name '*.pyc' -type f -exec bash -c 'for pyc; do dpkg -S "$pyc" &> /dev/null || rm -vf "$pyc"; done' -- '{}' +; postgres --version # buildkit
3.61MB RUN /bin/sh -c set -eux; apt-get update; apt-get install -y --no-install-recommends libnss-wrapper xz-utils zstd ; rm -rf /var/lib/apt/lists/* # buildkit
26.9MB RUN /bin/sh -c set -eux; if [ -f /etc/dpkg/dpkg.cfg.d/docker ]; then grep -q '/usr/share/locale' /etc/dpkg/dpkg.cfg.d/docker; sed -ri '/\/usr\/share\/locale/d' /etc/dpkg/dpkg.cfg.d/docker; ! grep -q '/usr/share/locale' /etc/dpkg/dpkg.cfg.d/docker; fi; apt-get update; apt-get install -y --no-install-recommends locales; rm -rf /var/lib/apt/lists/*; echo 'en_US.UTF-8 UTF-8' >> /etc/locale.gen; locale-gen; locale -a | grep 'en_US.utf8' # buildkit
4.27MB RUN /bin/sh -c set -eux; savedAptMark="$(apt-mark showmanual)"; apt-get update; apt-get install -y --no-install-recommends ca-certificates wget; rm -rf /var/lib/apt/lists/*; dpkgArch="$(dpkg --print-architecture | awk -F- '{ print $NF }')"; wget -O /usr/local/bin/gosu "https://github.com/tianon/gosu/releases/download/$GOSU_VERSION/gosu-$dpkgArch"; wget -O /usr/local/bin/gosu.asc "https://github.com/tianon/gosu/releases/download/$GOSU_VERSION/gosu-$dpkgArch.asc"; export GNUPGHOME="$(mktemp -d)"; gpg --batch --keyserver hkps://keys.openpgp.org --recv-keys B42F6819007F00F88E364FD4036A9C25BF357DD4; gpg --batch --verify /usr/local/bin/gosu.asc /usr/local/bin/gosu; gpgconf --kill all; rm -rf "$GNUPGHOME" /usr/local/bin/gosu.asc; apt-mark auto '.*' > /dev/null; [ -z "$savedAptMark" ] || apt-mark manual $savedAptMark > /dev/null; apt-get purge -y --auto-remove -o APT::AutoRemove::RecommendsImportant=false; chmod +x /usr/local/bin/gosu; gosu --version; gosu nobody true # buildkit
10.8MB RUN /bin/sh -c set -ex; apt-get update; apt-get install -y --no-install-recommends gnupg less ; rm -rf /var/lib/apt/lists/* # buildkit
85.2MB # debian.sh --arch 'amd64' out/ 'bookworm' '@1734912000'
Thread overview: 10+ messages
2025-01-09 08:53 deb package sizes Jeremy Schneider <[email protected]>
2025-01-09 09:07 ` Christoph Berg <[email protected]>
2025-01-09 16:06 ` Álvaro Hernández <[email protected]>
2025-01-09 17:08 ` Jeremy Schneider <[email protected]>
2025-01-09 22:40 ` Álvaro Hernández <[email protected]>
2025-01-10 09:52 ` Magnus Hagander <[email protected]>
2025-01-10 11:32 ` Álvaro Hernández <[email protected]>
2025-01-10 12:17 ` Christoph Berg <[email protected]>
2025-01-21 09:26 ` Cédric Villemain <[email protected]>
2025-01-09 15:43 ` Álvaro Hernández <[email protected]>