public inbox for [email protected]  
help / color / mirror / Atom feed
From: Jianghua Yang <[email protected]>
To: Michael Paquier <[email protected]>
Cc: [email protected]
Subject: Re: basebackup: add missing deflateEnd() in gzip compression sink
Date: Sat, 21 Mar 2026 14:22:25 -0700
Message-ID: <CAAZLFmQOH+fy6=ybEQ-xGdX9Zee2xkXhfuzCfbWweXxJ_6iD-A@mail.gmail.com> (raw)
In-Reply-To: <CAAZLFmQeS4AFC6mNQNhyZWGUMXEkOmQA11UL28N-6c=CYtqnoA@mail.gmail.com>
References: <CAAZLFmQNJ0QNArpWEOZXwv=vbumcWKEHz-b1me5gBqRqG67EwQ@mail.gmail.com>
	<[email protected]>
	<CAAZLFmQeS4AFC6mNQNhyZWGUMXEkOmQA11UL28N-6c=CYtqnoA@mail.gmail.com>

 Hi Michael,





  Thanks for the review and backpatch suggestion.



  Here is v1 as a two-patch series. The first patch is unchanged from the


  original submission. I've added a second patch that fixes another resource
  leak I noticed nearby in pg_basebackup:





    v1-0001: basebackup: add missing deflateEnd() calls in gzip compression
sink

    v1-0002: pg_basebackup: add missing close() for incremental manifest
file




  In the incremental backup code path, the file descriptor opened for
reading

  the manifest is never closed after the upload completes. Since the backup
  can run for a long time after this point, the leaked descriptor remains


  open for the entire duration.

   Regards,
  Jianghua Yang

Jianghua Yang <[email protected]> 于2026年3月21日周六 06:09写道:

>   Thanks for reviewing, Michael.
>
>   > based on the argument of the permanent connection my take is that
>   > this cleanup warrants a backpatch.
>
>   Agreed. The current code leaks zlib internal state on every archive,
>   and on a long-lived replication connection those allocations just
>   pile up with no cleanup until the connection ends.
>
> Michael Paquier <[email protected]> 于2026年3月20日周五 23:09写道:
>
>> On Fri, Mar 20, 2026 at 10:02:44AM -0700, Jianghua Yang wrote:
>> > While reviewing the backup compression backends, I noticed that the gzip
>> > sink (basebackup_gzip.c) never calls deflateEnd(), unlike the lz4 and
>> > zstd sinks which properly free their compression contexts.
>> >
>> > This means each archive (one per tablespace) leaks ~256KB of zlib
>> > internal state until the backup's memory context is destroyed. On the
>> > ERROR path, the zlib context is not released at all since there is no
>> > gzip-specific cleanup callback.
>>
>> Yes, you are right on this one.  This code could be better in cleaning
>> up the resources it is working on (and note that all the other
>> gzip-related code paths call the end() part if doing an init() or an
>> init2()).  Leaking that once per archive may not look like a big deal,
>> but this is backend-side code we are talking about, and on a
>> persistent replication connection it is not really cool to lose the
>> context of this data with garbage coming from zlib piling up, so based
>> on the argument of the permanent connection my take is that this
>> cleanup warrants a backpatch.
>> --
>> Michael
>>
>


Attachments:

  [application/octet-stream] v1-0002-pg_basebackup-add-missing-close-for-incremental-mani.patch (1.2K, 3-v1-0002-pg_basebackup-add-missing-close-for-incremental-mani.patch)
  download | inline diff:
From b596449c1a5d38f8decf40101313b24c8b2eef70 Mon Sep 17 00:00:00 2001
From: Jianghua Yang <[email protected]>
Date: Sat, 21 Mar 2026 14:17:05 -0700
Subject: [PATCH] pg_basebackup: add missing close() for incremental manifest
 file

When uploading an incremental manifest to the server, the file
descriptor opened for reading the manifest was never closed after
the upload completed. Since the backup can run for a long time
after this point, the leaked descriptor would remain open for
the entire duration.

Add the missing close() call after the file has been fully read.

Author: Jianghua Yang <[email protected]>
---
 src/bin/pg_basebackup/pg_basebackup.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/src/bin/pg_basebackup/pg_basebackup.c b/src/bin/pg_basebackup/pg_basebackup.c
index fa169a8d642..630b445cb23 100644
--- a/src/bin/pg_basebackup/pg_basebackup.c
+++ b/src/bin/pg_basebackup/pg_basebackup.c
@@ -1874,6 +1874,8 @@ BaseBackup(char *compression_algorithm, char *compression_detail,
 		if (nbytes < 0)
 			pg_fatal("could not read file \"%s\": %m", incremental_manifest);
 
+		close(fd);
+
 		/* End the COPY operation. */
 		if (PQputCopyEnd(conn, NULL) < 0)
 			pg_fatal("could not send end-of-COPY: %s",
-- 
2.50.1 (Apple Git-155)



  [application/octet-stream] v1-0001-basebackup-add-missing-deflateEnd-calls-in-gzip-comp.patch (3.3K, 4-v1-0001-basebackup-add-missing-deflateEnd-calls-in-gzip-comp.patch)
  download | inline diff:
From 5d1edeceafd422072c17026dfd1f862a19feb792 Mon Sep 17 00:00:00 2001
From: Jianghua Yang <[email protected]>
Date: Fri, 20 Mar 2026 09:57:17 -0700
Subject: [PATCH] basebackup: add missing deflateEnd() calls in gzip
 compression sink

The gzip basebackup sink called deflateInit2() in begin_archive but
never called deflateEnd(), leaking zlib's internal compression state
(~256KB per archive) until the memory context was destroyed. It also
lacked a dedicated cleanup callback, so on ERROR the zlib context
was not released at all.

Add deflateEnd() at the end of bbsink_gzip_end_archive() for the
normal path, and add a new bbsink_gzip_cleanup() function for the
error path.  This brings the gzip sink in line with the lz4 and zstd
sinks, which already properly free their compression contexts in both
end_archive and cleanup callbacks.

Author: Jianghua Yang <[email protected]>
---
 src/backend/backup/basebackup_gzip.c | 27 ++++++++++++++++++++++++++-
 1 file changed, 26 insertions(+), 1 deletion(-)

diff --git a/src/backend/backup/basebackup_gzip.c b/src/backend/backup/basebackup_gzip.c
index 1ba25015ab7..c264a833429 100644
--- a/src/backend/backup/basebackup_gzip.c
+++ b/src/backend/backup/basebackup_gzip.c
@@ -32,6 +32,9 @@ typedef struct bbsink_gzip
 
 	/* Number of bytes staged in output buffer. */
 	size_t		bytes_written;
+
+	/* Has the zstream been initialized with deflateInit2? */
+	bool		zstream_initialized;
 } bbsink_gzip;
 
 static void bbsink_gzip_begin_backup(bbsink *sink);
@@ -39,6 +42,7 @@ static void bbsink_gzip_begin_archive(bbsink *sink, const char *archive_name);
 static void bbsink_gzip_archive_contents(bbsink *sink, size_t len);
 static void bbsink_gzip_manifest_contents(bbsink *sink, size_t len);
 static void bbsink_gzip_end_archive(bbsink *sink);
+static void bbsink_gzip_cleanup(bbsink *sink);
 static void *gzip_palloc(void *opaque, unsigned items, unsigned size);
 static void gzip_pfree(void *opaque, void *address);
 
@@ -51,7 +55,7 @@ static const bbsink_ops bbsink_gzip_ops = {
 	.manifest_contents = bbsink_gzip_manifest_contents,
 	.end_manifest = bbsink_forward_end_manifest,
 	.end_backup = bbsink_forward_end_backup,
-	.cleanup = bbsink_forward_cleanup
+	.cleanup = bbsink_gzip_cleanup
 };
 #endif
 
@@ -141,6 +145,7 @@ bbsink_gzip_begin_archive(bbsink *sink, const char *archive_name)
 		ereport(ERROR,
 				errcode(ERRCODE_INTERNAL_ERROR),
 				errmsg("could not initialize compression library"));
+	mysink->zstream_initialized = true;
 
 	/*
 	 * Add ".gz" to the archive name. Note that the pg_basebackup -z produces
@@ -266,6 +271,10 @@ bbsink_gzip_end_archive(bbsink *sink)
 		mysink->bytes_written = 0;
 	}
 
+	/* Release the compression resources. */
+	deflateEnd(zs);
+	mysink->zstream_initialized = false;
+
 	/* Must also pass on the information that this archive has ended. */
 	bbsink_forward_end_archive(sink);
 }
@@ -301,4 +310,20 @@ gzip_pfree(void *opaque, void *address)
 	pfree(address);
 }
 
+/*
+ * In case the backup fails, make sure we free the compression context by
+ * calling deflateEnd() if needed to avoid a resource leak.
+ */
+static void
+bbsink_gzip_cleanup(bbsink *sink)
+{
+	bbsink_gzip *mysink = (bbsink_gzip *) sink;
+
+	if (mysink->zstream_initialized)
+	{
+		deflateEnd(&mysink->zstream);
+		mysink->zstream_initialized = false;
+	}
+}
+
 #endif
-- 
2.50.1 (Apple Git-155)



view thread (6+ messages)  latest in thread

reply

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Reply to all the recipients using the --to and --cc options:
  reply via email

  To: [email protected]
  Cc: [email protected], [email protected], [email protected]
  Subject: Re: basebackup: add missing deflateEnd() in gzip compression sink
  In-Reply-To: <CAAZLFmQOH+fy6=ybEQ-xGdX9Zee2xkXhfuzCfbWweXxJ_6iD-A@mail.gmail.com>

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox