public inbox for [email protected]
help / color / mirror / Atom feedbasebackup: add missing deflateEnd() in gzip compression sink
6+ messages / 2 participants
[nested] [flat]
* basebackup: add missing deflateEnd() in gzip compression sink
@ 2026-03-20 17:02 Jianghua Yang <[email protected]>
0 siblings, 1 reply; 6+ messages in thread
From: Jianghua Yang @ 2026-03-20 17:02 UTC (permalink / raw)
To: [email protected]
Hi hackers,
While reviewing the backup compression backends, I noticed that the gzip
sink (basebackup_gzip.c) never calls deflateEnd(), unlike the lz4 and
zstd sinks which properly free their compression contexts.
Specifically:
- lz4 calls LZ4F_freeCompressionContext() in both end_archive and a
dedicated bbsink_lz4_cleanup() callback.
- zstd calls ZSTD_freeCCtx() in both end_archive and a dedicated
bbsink_zstd_cleanup() callback.
- gzip calls deflateInit2() in begin_archive but never calls
deflateEnd(). Its cleanup callback is bbsink_forward_cleanup,
which just forwards to the next sink without releasing the zlib
context.
This means each archive (one per tablespace) leaks ~256KB of zlib
internal state until the backup's memory context is destroyed. On the
ERROR path, the zlib context is not released at all since there is no
gzip-specific cleanup callback.
The attached patch adds deflateEnd() at the end of
bbsink_gzip_end_archive() for the normal path, and introduces a new
bbsink_gzip_cleanup() for the error path. The new cleanup callback
follows the exact same call chain as bbsink_lz4_cleanup and
bbsink_zstd_cleanup:
PG_FINALLY (basebackup.c:1063)
-> bbsink_cleanup(sink) (basebackup.c:1065)
-> sink->bbs_ops->cleanup(sink) (basebackup_sink.h:269)
-> bbsink_gzip_cleanup() -- now calls deflateEnd()
Previously the gzip slot in this chain pointed to
bbsink_forward_cleanup, which just forwarded to the next sink
without doing any gzip-specific resource release.
Since z_stream is an embedded struct (not a pointer like LZ4F_cctx or
ZSTD_CCtx), a bool flag "zstream_initialized" is used to track whether
deflateEnd() needs to be called.
Tested with pg_basebackup using gzip compression at levels 1, 5
(default), and 9, including server-side compression. All produced
valid .tar.gz files that pass gzip -t integrity checks and restore
correctly.
Regards,
Jianghua Yang
Attachments:
[application/octet-stream] 0001-basebackup-add-missing-deflateEnd-calls-in-gzip-comp.patch (3.3K, 3-0001-basebackup-add-missing-deflateEnd-calls-in-gzip-comp.patch)
download | inline diff:
From 5d1edeceafd422072c17026dfd1f862a19feb792 Mon Sep 17 00:00:00 2001
From: Jianghua Yang <[email protected]>
Date: Fri, 20 Mar 2026 09:57:17 -0700
Subject: [PATCH] basebackup: add missing deflateEnd() calls in gzip
compression sink
The gzip basebackup sink called deflateInit2() in begin_archive but
never called deflateEnd(), leaking zlib's internal compression state
(~256KB per archive) until the memory context was destroyed. It also
lacked a dedicated cleanup callback, so on ERROR the zlib context
was not released at all.
Add deflateEnd() at the end of bbsink_gzip_end_archive() for the
normal path, and add a new bbsink_gzip_cleanup() function for the
error path. This brings the gzip sink in line with the lz4 and zstd
sinks, which already properly free their compression contexts in both
end_archive and cleanup callbacks.
Author: Jianghua Yang <[email protected]>
---
src/backend/backup/basebackup_gzip.c | 27 ++++++++++++++++++++++++++-
1 file changed, 26 insertions(+), 1 deletion(-)
diff --git a/src/backend/backup/basebackup_gzip.c b/src/backend/backup/basebackup_gzip.c
index 1ba25015ab7..c264a833429 100644
--- a/src/backend/backup/basebackup_gzip.c
+++ b/src/backend/backup/basebackup_gzip.c
@@ -32,6 +32,9 @@ typedef struct bbsink_gzip
/* Number of bytes staged in output buffer. */
size_t bytes_written;
+
+ /* Has the zstream been initialized with deflateInit2? */
+ bool zstream_initialized;
} bbsink_gzip;
static void bbsink_gzip_begin_backup(bbsink *sink);
@@ -39,6 +42,7 @@ static void bbsink_gzip_begin_archive(bbsink *sink, const char *archive_name);
static void bbsink_gzip_archive_contents(bbsink *sink, size_t len);
static void bbsink_gzip_manifest_contents(bbsink *sink, size_t len);
static void bbsink_gzip_end_archive(bbsink *sink);
+static void bbsink_gzip_cleanup(bbsink *sink);
static void *gzip_palloc(void *opaque, unsigned items, unsigned size);
static void gzip_pfree(void *opaque, void *address);
@@ -51,7 +55,7 @@ static const bbsink_ops bbsink_gzip_ops = {
.manifest_contents = bbsink_gzip_manifest_contents,
.end_manifest = bbsink_forward_end_manifest,
.end_backup = bbsink_forward_end_backup,
- .cleanup = bbsink_forward_cleanup
+ .cleanup = bbsink_gzip_cleanup
};
#endif
@@ -141,6 +145,7 @@ bbsink_gzip_begin_archive(bbsink *sink, const char *archive_name)
ereport(ERROR,
errcode(ERRCODE_INTERNAL_ERROR),
errmsg("could not initialize compression library"));
+ mysink->zstream_initialized = true;
/*
* Add ".gz" to the archive name. Note that the pg_basebackup -z produces
@@ -266,6 +271,10 @@ bbsink_gzip_end_archive(bbsink *sink)
mysink->bytes_written = 0;
}
+ /* Release the compression resources. */
+ deflateEnd(zs);
+ mysink->zstream_initialized = false;
+
/* Must also pass on the information that this archive has ended. */
bbsink_forward_end_archive(sink);
}
@@ -301,4 +310,20 @@ gzip_pfree(void *opaque, void *address)
pfree(address);
}
+/*
+ * In case the backup fails, make sure we free the compression context by
+ * calling deflateEnd() if needed to avoid a resource leak.
+ */
+static void
+bbsink_gzip_cleanup(bbsink *sink)
+{
+ bbsink_gzip *mysink = (bbsink_gzip *) sink;
+
+ if (mysink->zstream_initialized)
+ {
+ deflateEnd(&mysink->zstream);
+ mysink->zstream_initialized = false;
+ }
+}
+
#endif
--
2.50.1 (Apple Git-155)
^ permalink raw reply [nested|flat] 6+ messages in thread
* Re: basebackup: add missing deflateEnd() in gzip compression sink
@ 2026-03-21 06:09 Michael Paquier <[email protected]>
parent: Jianghua Yang <[email protected]>
0 siblings, 1 reply; 6+ messages in thread
From: Michael Paquier @ 2026-03-21 06:09 UTC (permalink / raw)
To: Jianghua Yang <[email protected]>; +Cc: [email protected]
On Fri, Mar 20, 2026 at 10:02:44AM -0700, Jianghua Yang wrote:
> While reviewing the backup compression backends, I noticed that the gzip
> sink (basebackup_gzip.c) never calls deflateEnd(), unlike the lz4 and
> zstd sinks which properly free their compression contexts.
>
> This means each archive (one per tablespace) leaks ~256KB of zlib
> internal state until the backup's memory context is destroyed. On the
> ERROR path, the zlib context is not released at all since there is no
> gzip-specific cleanup callback.
Yes, you are right on this one. This code could be better in cleaning
up the resources it is working on (and note that all the other
gzip-related code paths call the end() part if doing an init() or an
init2()). Leaking that once per archive may not look like a big deal,
but this is backend-side code we are talking about, and on a
persistent replication connection it is not really cool to lose the
context of this data with garbage coming from zlib piling up, so based
on the argument of the permanent connection my take is that this
cleanup warrants a backpatch.
--
Michael
Attachments:
[application/pgp-signature] signature.asc (833B, 2-signature.asc)
download
^ permalink raw reply [nested|flat] 6+ messages in thread
* Re: basebackup: add missing deflateEnd() in gzip compression sink
@ 2026-03-21 13:09 Jianghua Yang <[email protected]>
parent: Michael Paquier <[email protected]>
0 siblings, 1 reply; 6+ messages in thread
From: Jianghua Yang @ 2026-03-21 13:09 UTC (permalink / raw)
To: Michael Paquier <[email protected]>; +Cc: [email protected]
Thanks for reviewing, Michael.
> based on the argument of the permanent connection my take is that
> this cleanup warrants a backpatch.
Agreed. The current code leaks zlib internal state on every archive,
and on a long-lived replication connection those allocations just
pile up with no cleanup until the connection ends.
Michael Paquier <[email protected]> 于2026年3月20日周五 23:09写道:
> On Fri, Mar 20, 2026 at 10:02:44AM -0700, Jianghua Yang wrote:
> > While reviewing the backup compression backends, I noticed that the gzip
> > sink (basebackup_gzip.c) never calls deflateEnd(), unlike the lz4 and
> > zstd sinks which properly free their compression contexts.
> >
> > This means each archive (one per tablespace) leaks ~256KB of zlib
> > internal state until the backup's memory context is destroyed. On the
> > ERROR path, the zlib context is not released at all since there is no
> > gzip-specific cleanup callback.
>
> Yes, you are right on this one. This code could be better in cleaning
> up the resources it is working on (and note that all the other
> gzip-related code paths call the end() part if doing an init() or an
> init2()). Leaking that once per archive may not look like a big deal,
> but this is backend-side code we are talking about, and on a
> persistent replication connection it is not really cool to lose the
> context of this data with garbage coming from zlib piling up, so based
> on the argument of the permanent connection my take is that this
> cleanup warrants a backpatch.
> --
> Michael
>
^ permalink raw reply [nested|flat] 6+ messages in thread
* Re: basebackup: add missing deflateEnd() in gzip compression sink
@ 2026-03-21 21:22 Jianghua Yang <[email protected]>
parent: Jianghua Yang <[email protected]>
0 siblings, 1 reply; 6+ messages in thread
From: Jianghua Yang @ 2026-03-21 21:22 UTC (permalink / raw)
To: Michael Paquier <[email protected]>; +Cc: [email protected]
Hi Michael,
Thanks for the review and backpatch suggestion.
Here is v1 as a two-patch series. The first patch is unchanged from the
original submission. I've added a second patch that fixes another resource
leak I noticed nearby in pg_basebackup:
v1-0001: basebackup: add missing deflateEnd() calls in gzip compression
sink
v1-0002: pg_basebackup: add missing close() for incremental manifest
file
In the incremental backup code path, the file descriptor opened for
reading
the manifest is never closed after the upload completes. Since the backup
can run for a long time after this point, the leaked descriptor remains
open for the entire duration.
Regards,
Jianghua Yang
Jianghua Yang <[email protected]> 于2026年3月21日周六 06:09写道:
> Thanks for reviewing, Michael.
>
> > based on the argument of the permanent connection my take is that
> > this cleanup warrants a backpatch.
>
> Agreed. The current code leaks zlib internal state on every archive,
> and on a long-lived replication connection those allocations just
> pile up with no cleanup until the connection ends.
>
> Michael Paquier <[email protected]> 于2026年3月20日周五 23:09写道:
>
>> On Fri, Mar 20, 2026 at 10:02:44AM -0700, Jianghua Yang wrote:
>> > While reviewing the backup compression backends, I noticed that the gzip
>> > sink (basebackup_gzip.c) never calls deflateEnd(), unlike the lz4 and
>> > zstd sinks which properly free their compression contexts.
>> >
>> > This means each archive (one per tablespace) leaks ~256KB of zlib
>> > internal state until the backup's memory context is destroyed. On the
>> > ERROR path, the zlib context is not released at all since there is no
>> > gzip-specific cleanup callback.
>>
>> Yes, you are right on this one. This code could be better in cleaning
>> up the resources it is working on (and note that all the other
>> gzip-related code paths call the end() part if doing an init() or an
>> init2()). Leaking that once per archive may not look like a big deal,
>> but this is backend-side code we are talking about, and on a
>> persistent replication connection it is not really cool to lose the
>> context of this data with garbage coming from zlib piling up, so based
>> on the argument of the permanent connection my take is that this
>> cleanup warrants a backpatch.
>> --
>> Michael
>>
>
Attachments:
[application/octet-stream] v1-0002-pg_basebackup-add-missing-close-for-incremental-mani.patch (1.2K, 3-v1-0002-pg_basebackup-add-missing-close-for-incremental-mani.patch)
download | inline diff:
From b596449c1a5d38f8decf40101313b24c8b2eef70 Mon Sep 17 00:00:00 2001
From: Jianghua Yang <[email protected]>
Date: Sat, 21 Mar 2026 14:17:05 -0700
Subject: [PATCH] pg_basebackup: add missing close() for incremental manifest
file
When uploading an incremental manifest to the server, the file
descriptor opened for reading the manifest was never closed after
the upload completed. Since the backup can run for a long time
after this point, the leaked descriptor would remain open for
the entire duration.
Add the missing close() call after the file has been fully read.
Author: Jianghua Yang <[email protected]>
---
src/bin/pg_basebackup/pg_basebackup.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/src/bin/pg_basebackup/pg_basebackup.c b/src/bin/pg_basebackup/pg_basebackup.c
index fa169a8d642..630b445cb23 100644
--- a/src/bin/pg_basebackup/pg_basebackup.c
+++ b/src/bin/pg_basebackup/pg_basebackup.c
@@ -1874,6 +1874,8 @@ BaseBackup(char *compression_algorithm, char *compression_detail,
if (nbytes < 0)
pg_fatal("could not read file \"%s\": %m", incremental_manifest);
+ close(fd);
+
/* End the COPY operation. */
if (PQputCopyEnd(conn, NULL) < 0)
pg_fatal("could not send end-of-COPY: %s",
--
2.50.1 (Apple Git-155)
[application/octet-stream] v1-0001-basebackup-add-missing-deflateEnd-calls-in-gzip-comp.patch (3.3K, 4-v1-0001-basebackup-add-missing-deflateEnd-calls-in-gzip-comp.patch)
download | inline diff:
From 5d1edeceafd422072c17026dfd1f862a19feb792 Mon Sep 17 00:00:00 2001
From: Jianghua Yang <[email protected]>
Date: Fri, 20 Mar 2026 09:57:17 -0700
Subject: [PATCH] basebackup: add missing deflateEnd() calls in gzip
compression sink
The gzip basebackup sink called deflateInit2() in begin_archive but
never called deflateEnd(), leaking zlib's internal compression state
(~256KB per archive) until the memory context was destroyed. It also
lacked a dedicated cleanup callback, so on ERROR the zlib context
was not released at all.
Add deflateEnd() at the end of bbsink_gzip_end_archive() for the
normal path, and add a new bbsink_gzip_cleanup() function for the
error path. This brings the gzip sink in line with the lz4 and zstd
sinks, which already properly free their compression contexts in both
end_archive and cleanup callbacks.
Author: Jianghua Yang <[email protected]>
---
src/backend/backup/basebackup_gzip.c | 27 ++++++++++++++++++++++++++-
1 file changed, 26 insertions(+), 1 deletion(-)
diff --git a/src/backend/backup/basebackup_gzip.c b/src/backend/backup/basebackup_gzip.c
index 1ba25015ab7..c264a833429 100644
--- a/src/backend/backup/basebackup_gzip.c
+++ b/src/backend/backup/basebackup_gzip.c
@@ -32,6 +32,9 @@ typedef struct bbsink_gzip
/* Number of bytes staged in output buffer. */
size_t bytes_written;
+
+ /* Has the zstream been initialized with deflateInit2? */
+ bool zstream_initialized;
} bbsink_gzip;
static void bbsink_gzip_begin_backup(bbsink *sink);
@@ -39,6 +42,7 @@ static void bbsink_gzip_begin_archive(bbsink *sink, const char *archive_name);
static void bbsink_gzip_archive_contents(bbsink *sink, size_t len);
static void bbsink_gzip_manifest_contents(bbsink *sink, size_t len);
static void bbsink_gzip_end_archive(bbsink *sink);
+static void bbsink_gzip_cleanup(bbsink *sink);
static void *gzip_palloc(void *opaque, unsigned items, unsigned size);
static void gzip_pfree(void *opaque, void *address);
@@ -51,7 +55,7 @@ static const bbsink_ops bbsink_gzip_ops = {
.manifest_contents = bbsink_gzip_manifest_contents,
.end_manifest = bbsink_forward_end_manifest,
.end_backup = bbsink_forward_end_backup,
- .cleanup = bbsink_forward_cleanup
+ .cleanup = bbsink_gzip_cleanup
};
#endif
@@ -141,6 +145,7 @@ bbsink_gzip_begin_archive(bbsink *sink, const char *archive_name)
ereport(ERROR,
errcode(ERRCODE_INTERNAL_ERROR),
errmsg("could not initialize compression library"));
+ mysink->zstream_initialized = true;
/*
* Add ".gz" to the archive name. Note that the pg_basebackup -z produces
@@ -266,6 +271,10 @@ bbsink_gzip_end_archive(bbsink *sink)
mysink->bytes_written = 0;
}
+ /* Release the compression resources. */
+ deflateEnd(zs);
+ mysink->zstream_initialized = false;
+
/* Must also pass on the information that this archive has ended. */
bbsink_forward_end_archive(sink);
}
@@ -301,4 +310,20 @@ gzip_pfree(void *opaque, void *address)
pfree(address);
}
+/*
+ * In case the backup fails, make sure we free the compression context by
+ * calling deflateEnd() if needed to avoid a resource leak.
+ */
+static void
+bbsink_gzip_cleanup(bbsink *sink)
+{
+ bbsink_gzip *mysink = (bbsink_gzip *) sink;
+
+ if (mysink->zstream_initialized)
+ {
+ deflateEnd(&mysink->zstream);
+ mysink->zstream_initialized = false;
+ }
+}
+
#endif
--
2.50.1 (Apple Git-155)
^ permalink raw reply [nested|flat] 6+ messages in thread
* Re: basebackup: add missing deflateEnd() in gzip compression sink
@ 2026-03-23 00:31 Michael Paquier <[email protected]>
parent: Jianghua Yang <[email protected]>
0 siblings, 1 reply; 6+ messages in thread
From: Michael Paquier @ 2026-03-23 00:31 UTC (permalink / raw)
To: Jianghua Yang <[email protected]>; +Cc: [email protected]
On Sat, Mar 21, 2026 at 02:22:25PM -0700, Jianghua Yang wrote:
> v1-0001: basebackup: add missing deflateEnd() calls in gzip compression
> sink
After double-checking the whole code, I agree that this is a good
practice to have in the tree. However, the issue is not worth
bothering in back-branches as the server-side base backup gzip code
relies on allocation and free callbacks, with zlib internals doing
nothing with fds or more persistent states as far as I have read its
code. For the current use, we'd bloat this data once per tablespace
in a single base backup, safe even if the connection is persistent
(missed that in my first message).
What I am more worried about are future callers of this code, though,
and we care about having a end() call for each matching init[2]() call
in the tree in all the places that rely on gzip internals. So that's
a good practice on consistency ground, at least. For these reasons,
applied that on HEAD.
> v1-0002: pg_basebackup: add missing close() for incremental manifest
> file
This one does not matter. This resource is for a backup manifest and
we are talking about a single one for a single invocation of the
binary.
--
Michael
Attachments:
[application/pgp-signature] signature.asc (833B, 2-signature.asc)
download
^ permalink raw reply [nested|flat] 6+ messages in thread
* Re: basebackup: add missing deflateEnd() in gzip compression sink
@ 2026-03-23 01:36 Jianghua Yang <[email protected]>
parent: Michael Paquier <[email protected]>
0 siblings, 0 replies; 6+ messages in thread
From: Jianghua Yang @ 2026-03-23 01:36 UTC (permalink / raw)
To: Michael Paquier <[email protected]>; +Cc: [email protected]
Hi Michael,
Thank you for reviewing and committing the patch!
noted on v1-0002 — understood that the incremental manifest file close
isn't an issue there.
Thanks again.
Regards,
Jianghua Yang
Michael Paquier <[email protected]> 于2026年3月22日周日 17:32写道:
> On Sat, Mar 21, 2026 at 02:22:25PM -0700, Jianghua Yang wrote:
> > v1-0001: basebackup: add missing deflateEnd() calls in gzip compression
> > sink
>
> After double-checking the whole code, I agree that this is a good
> practice to have in the tree. However, the issue is not worth
> bothering in back-branches as the server-side base backup gzip code
> relies on allocation and free callbacks, with zlib internals doing
> nothing with fds or more persistent states as far as I have read its
> code. For the current use, we'd bloat this data once per tablespace
> in a single base backup, safe even if the connection is persistent
> (missed that in my first message).
>
> What I am more worried about are future callers of this code, though,
> and we care about having a end() call for each matching init[2]() call
> in the tree in all the places that rely on gzip internals. So that's
> a good practice on consistency ground, at least. For these reasons,
> applied that on HEAD.
>
> > v1-0002: pg_basebackup: add missing close() for incremental manifest
> > file
>
> This one does not matter. This resource is for a backup manifest and
> we are talking about a single one for a single invocation of the
> binary.
> --
> Michael
>
^ permalink raw reply [nested|flat] 6+ messages in thread
end of thread, other threads:[~2026-03-23 01:36 UTC | newest]
Thread overview: 6+ messages (download: mbox mbox.gz follow: Atom feed)
-- links below jump to the message on this page --
2026-03-20 17:02 basebackup: add missing deflateEnd() in gzip compression sink Jianghua Yang <[email protected]>
2026-03-21 06:09 ` Michael Paquier <[email protected]>
2026-03-21 13:09 ` Jianghua Yang <[email protected]>
2026-03-21 21:22 ` Jianghua Yang <[email protected]>
2026-03-23 00:31 ` Michael Paquier <[email protected]>
2026-03-23 01:36 ` Jianghua Yang <[email protected]>
This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox