public inbox for [email protected]  
help / color / mirror / Atom feed
Re: Memory leak in pg_stat_statements when qtext file contains invalid encoding
3+ messages / 3 participants
[nested] [flat]

* Re: Memory leak in pg_stat_statements when qtext file contains invalid encoding
@ 2026-03-27 08:21  Lukas Fittl <[email protected]>
  0 siblings, 2 replies; 3+ messages in thread

From: Lukas Fittl @ 2026-03-27 08:21 UTC (permalink / raw)
  To: Gaurav Singh <[email protected]>; +Cc: [email protected]

Hi Gaurav,

On Fri, Mar 27, 2026 at 12:54 AM Gaurav Singh <[email protected]> wrote:
> If the qtext file contains an invalid encoding, pg_any_to_server calls ereport(ERROR) which longjmps out of the function.
> The cleanup code at the bottom of the function is never reached.
>
> LWLockRelease(pgss->lock);
> if (qbuffer)
> free(qbuffer);
> On every subsequent call, the malloc'd buffer (the entire file contents) is leaked, and the LWLock release is also skipped.

I don't think the analysis is correct in regards to the LWLock release
- that should be taken care of by LWLockReleaseAll on abort.

But I think you're correct about qbuffer - because that buffer is
using malloc (not palloc), its not part of any memory context, and so
it will happily leak on abort.

It appears our use of malloc in pg_stat_statements is so that we can
fail on OOM and return NULL without a jump. I think that makes sense
for when a GC cycle was triggered during regular query execution
(since we don't want to error the original query), but it seems like
just bubbling up the OOM if needed when querying the
pg_stat_statements function seems fine.

I wonder if its worth separating the two cases, since the issue you're
describing (the call to pg_any_to_server failing) only happens when
returning the query text file contents to the client. I think your
PG_FINALLY suggestion could also work, but it feels a bit tedious to
wrap the whole pg_stat_statements_internal function in it.

Thanks,
Lukas

PS: I would recommend reviewing the use of a text format email client
for posting to the Postgres mailing lists, or significantly reducing
your formatting when sending HTML emails - your email has a lot of
styling that is hard to read (even for me in Gmail), and even harder
to quote in a plain text email response.


--
Lukas Fittl






^ permalink  raw  reply  [nested|flat] 3+ messages in thread

* Re: Memory leak in pg_stat_statements when qtext file contains invalid encoding
@ 2026-03-27 08:49  Gaurav Singh <[email protected]>
  parent: Lukas Fittl <[email protected]>
  1 sibling, 0 replies; 3+ messages in thread

From: Gaurav Singh @ 2026-03-27 08:49 UTC (permalink / raw)
  To: Lukas Fittl <[email protected]>; +Cc: [email protected]

Hi Lukas,

Thank you for the correction on the LWLock. You are right, LWLockReleaseAll
on abort handles that. The leak is limited to the malloc'd qbuffer.

I thought about switching to palloc for the pg_stat_statements_internal
path, but I think it would change the existing OOM behavior in a way that
upstream may not want.

Currently, when qtext_load_file fails on OOM, it returns NULL and the
function continues gracefully, returning rows with NULL query text columns.
The user still gets their result set. With palloc, an OOM
would instead throw a hard ERROR, which changes the semantics from graceful
degradation to a failure.

Additionally, qtext_load_file is called from gc_qtexts (where an ERROR
during garbage collection would abort the user's actual in-flight query)
and from pgss_shmem_shutdown (where an ERROR could interfere with a clean
server stop). Creating a separate palloc-based variant just for
pg_stat_statements_internal would avoid those issues, but it would
still change the OOM behavior from silent degradation to a
visible error for that path.

The PG_TRY/PG_FINALLY approach preserves the existing malloc-based OOM
semantics exactly as they are today. The only thing it adds is
cleanup of the malloc'd buffer when pg_any_to_server throws an encoding
error. In terms of scope, it does not need to wrap the entire function. It
only needs to cover the section after LWLockAcquire where qbuffer is
live through the end of the hash iteration loop, which is where
pg_any_to_server can throw.

I can also scope the PG_FINALLY to just free(qbuffer) since
you confirmed LWLockReleaseAll already handles the lock on abort. That
would make it even more targeted. Happy to send a patch either way.

Apologies for the HTML formatting on the previous email.
I will use plain text going forward.


Thanks,

Gaurav

On Fri, Mar 27, 2026 at 1:52 PM Lukas Fittl <[email protected]> wrote:

> Hi Gaurav,
>
> On Fri, Mar 27, 2026 at 12:54 AM Gaurav Singh <[email protected]>
> wrote:
> > If the qtext file contains an invalid encoding, pg_any_to_server calls
> ereport(ERROR) which longjmps out of the function.
> > The cleanup code at the bottom of the function is never reached.
> >
> > LWLockRelease(pgss->lock);
> > if (qbuffer)
> > free(qbuffer);
> > On every subsequent call, the malloc'd buffer (the entire file contents)
> is leaked, and the LWLock release is also skipped.
>
> I don't think the analysis is correct in regards to the LWLock release
> - that should be taken care of by LWLockReleaseAll on abort.
>
> But I think you're correct about qbuffer - because that buffer is
> using malloc (not palloc), its not part of any memory context, and so
> it will happily leak on abort.
>
> It appears our use of malloc in pg_stat_statements is so that we can
> fail on OOM and return NULL without a jump. I think that makes sense
> for when a GC cycle was triggered during regular query execution
> (since we don't want to error the original query), but it seems like
> just bubbling up the OOM if needed when querying the
> pg_stat_statements function seems fine.
>
> I wonder if its worth separating the two cases, since the issue you're
> describing (the call to pg_any_to_server failing) only happens when
> returning the query text file contents to the client. I think your
> PG_FINALLY suggestion could also work, but it feels a bit tedious to
> wrap the whole pg_stat_statements_internal function in it.
>
> Thanks,
> Lukas
>
> PS: I would recommend reviewing the use of a text format email client
> for posting to the Postgres mailing lists, or significantly reducing
> your formatting when sending HTML emails - your email has a lot of
> styling that is hard to read (even for me in Gmail), and even harder
> to quote in a plain text email response.
>
>
> --
> Lukas Fittl
>


^ permalink  raw  reply  [nested|flat] 3+ messages in thread

* Re: Memory leak in pg_stat_statements when qtext file contains invalid encoding
@ 2026-03-27 08:59  Heikki Linnakangas <[email protected]>
  parent: Lukas Fittl <[email protected]>
  1 sibling, 0 replies; 3+ messages in thread

From: Heikki Linnakangas @ 2026-03-27 08:59 UTC (permalink / raw)
  To: Lukas Fittl <[email protected]>; Gaurav Singh <[email protected]>; +Cc: [email protected]

On 27/03/2026 10:21, Lukas Fittl wrote:
> Hi Gaurav,
> 
> On Fri, Mar 27, 2026 at 12:54 AM Gaurav Singh <[email protected]> wrote:
>> If the qtext file contains an invalid encoding, pg_any_to_server calls ereport(ERROR) which longjmps out of the function.
>> The cleanup code at the bottom of the function is never reached.
>>
>> LWLockRelease(pgss->lock);
>> if (qbuffer)
>> free(qbuffer);
>> On every subsequent call, the malloc'd buffer (the entire file contents) is leaked, and the LWLock release is also skipped.
> 
> I don't think the analysis is correct in regards to the LWLock release
> - that should be taken care of by LWLockReleaseAll on abort.
> 
> But I think you're correct about qbuffer - because that buffer is
> using malloc (not palloc), its not part of any memory context, and so
> it will happily leak on abort.

Yep

> It appears our use of malloc in pg_stat_statements is so that we can
> fail on OOM and return NULL without a jump. I think that makes sense
> for when a GC cycle was triggered during regular query execution
> (since we don't want to error the original query), but it seems like
> just bubbling up the OOM if needed when querying the
> pg_stat_statements function seems fine.
> 
> I wonder if its worth separating the two cases, since the issue you're
> describing (the call to pg_any_to_server failing) only happens when
> returning the query text file contents to the client. I think your
> PG_FINALLY suggestion could also work, but it feels a bit tedious to
> wrap the whole pg_stat_statements_internal function in it.

Hmm, perhaps. But there's a simpler, less invasive fix. When that code 
was written, we didn't have MCXT_ALLOC_HUGE nor MCXT_ALLOC_NO_OOM. Now 
that we do, we can just use palloc_extended(MCXT_ALLOC_HUGE | 
MCXT_ALLOC_NO_OOM) instead of raw malloc(). Per attached.

- Heikki


Attachments:

  [text/x-patch] 0001-Avoid-memory-leak-on-error-while-parsing-pg_stat_sta.patch (3.1K, 2-0001-Avoid-memory-leak-on-error-while-parsing-pg_stat_sta.patch)
  download | inline diff:
From a88fdd79e0131a25141f5efd84246026b3b7f6ff Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <[email protected]>
Date: Fri, 27 Mar 2026 10:52:04 +0200
Subject: [PATCH 1/1] Avoid memory leak on error while parsing
 pg_stat_statements dump file

By using palloc() instead of raw malloc().

Reported-by: Gaurav Singh <[email protected]>
Reviewed-by: Lukas Fittl <[email protected]>
Discussion: https://www.postgresql.org/message-id/CAEcQ1bYR9s4eQLFDjzzJHU8fj-MTbmRpW-9J-r2gsCn+HEsynw@mail.gmail.com
---
 .../pg_stat_statements/pg_stat_statements.c   | 22 +++++++++++--------
 1 file changed, 13 insertions(+), 9 deletions(-)

diff --git a/contrib/pg_stat_statements/pg_stat_statements.c b/contrib/pg_stat_statements/pg_stat_statements.c
index 6cb14824ec3..7975476b890 100644
--- a/contrib/pg_stat_statements/pg_stat_statements.c
+++ b/contrib/pg_stat_statements/pg_stat_statements.c
@@ -805,7 +805,7 @@ pgss_shmem_shutdown(int code, Datum arg)
 	if (fwrite(&pgss->stats, sizeof(pgssGlobalStats), 1, file) != 1)
 		goto error;
 
-	free(qbuffer);
+	pfree(qbuffer);
 	qbuffer = NULL;
 
 	if (FreeFile(file))
@@ -829,7 +829,8 @@ error:
 			(errcode_for_file_access(),
 			 errmsg("could not write file \"%s\": %m",
 					PGSS_DUMP_FILE ".tmp")));
-	free(qbuffer);
+	if (qbuffer)
+		pfree(qbuffer);
 	if (file)
 		FreeFile(file);
 	unlink(PGSS_DUMP_FILE ".tmp");
@@ -1825,7 +1826,8 @@ pg_stat_statements_internal(FunctionCallInfo fcinfo,
 			pgss->extent != extent ||
 			pgss->gc_count != gc_count)
 		{
-			free(qbuffer);
+			if (qbuffer)
+				pfree(qbuffer);
 			qbuffer = qtext_load_file(&qbuffer_size);
 		}
 	}
@@ -2046,7 +2048,8 @@ pg_stat_statements_internal(FunctionCallInfo fcinfo,
 
 	LWLockRelease(pgss->lock);
 
-	free(qbuffer);
+	if (qbuffer)
+		pfree(qbuffer);
 }
 
 /* Number of output arguments (columns) for pg_stat_statements_info */
@@ -2333,7 +2336,7 @@ error:
 }
 
 /*
- * Read the external query text file into a malloc'd buffer.
+ * Read the external query text file into a palloc'd buffer.
  *
  * Returns NULL (without throwing an error) if unable to read, eg
  * file not there or insufficient memory.
@@ -2375,7 +2378,7 @@ qtext_load_file(Size *buffer_size)
 
 	/* Allocate buffer; beware that off_t might be wider than size_t */
 	if (stat.st_size <= MaxAllocHugeSize)
-		buf = (char *) malloc(stat.st_size);
+		buf = (char *) palloc_extended(stat.st_size, MCXT_ALLOC_HUGE | MCXT_ALLOC_NO_OOM);
 	else
 		buf = NULL;
 	if (buf == NULL)
@@ -2414,7 +2417,7 @@ qtext_load_file(Size *buffer_size)
 						(errcode_for_file_access(),
 						 errmsg("could not read file \"%s\": %m",
 								PGSS_TEXT_FILE)));
-			free(buf);
+			pfree(buf);
 			CloseTransientFile(fd);
 			return NULL;
 		}
@@ -2625,7 +2628,7 @@ gc_qtexts(void)
 	else
 		pgss->mean_query_len = ASSUMED_LENGTH_INIT;
 
-	free(qbuffer);
+	pfree(qbuffer);
 
 	/*
 	 * OK, count a garbage collection cycle.  (Note: even though we have
@@ -2642,7 +2645,8 @@ gc_fail:
 	/* clean up resources */
 	if (qfile)
 		FreeFile(qfile);
-	free(qbuffer);
+	if (qbuffer)
+		pfree(qbuffer);
 
 	/*
 	 * Since the contents of the external file are now uncertain, mark all
-- 
2.47.3



^ permalink  raw  reply  [nested|flat] 3+ messages in thread


end of thread, other threads:[~2026-03-27 08:59 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz follow: Atom feed)
-- links below jump to the message on this page --
2026-03-27 08:21 Re: Memory leak in pg_stat_statements when qtext file contains invalid encoding Lukas Fittl <[email protected]>
2026-03-27 08:49 ` Gaurav Singh <[email protected]>
2026-03-27 08:59 ` Heikki Linnakangas <[email protected]>

This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox