public inbox for [email protected]
help / color / mirror / Atom feedFrom: Antonin Houska <[email protected]>
To: Alvaro Herrera <[email protected]>
Cc: Baji Shaik <[email protected]>
Cc: [email protected]
Subject: Re: [PATCH] Fix REPACK decoding worker not cleaned up on FATAL exit
Date: Wed, 20 May 2026 09:54:06 +0200
Message-ID: <8656.1779263646@localhost> (raw)
In-Reply-To: <[email protected]>
References: <[email protected]>
Alvaro Herrera <[email protected]> wrote:
> On 2026-May-17, Baji Shaik wrote:
>
> > v3 uses PG_ENSURE_ERROR_CLEANUP which:
> > - Handles both ERROR and FATAL exits
> > - Automatically cancels the callback on normal completion
> > (no slot leak)
> > - Runs before memory contexts are destroyed (no use-after-free)
>
> Yeah, looks good. I have pushed it, with some comment wordsmithing and
> other cosmetic changes.
>
> While looking at it, I realized that I didn't like the way
> stop_repack_decoding_worker() works, mainly because if there's no
> handle, we leak everything else -- and the way we initialize things
> means we leak the shared memory segment. This is maybe a rare case and
> just a small memory leak, but it seems better to do it nicely. So
> here's a followup patch that reworks that code. This also forced me to
> understand more clearly what is going on, so I rewrote the comments.
The call of shm_mq_detach() got lost, or do you rely on dsm_detach() to call
shm_mq_detach_callback() ? The latter does not free ->mqh_buffer. Since each
REPACK runs in a separate transaction, I wouldn't consider that a leak, but I
still think that explicit call of shm_mq_detach() makes the code a bit easier
to read (i.e. no need for the developer to check if the detaching happens
automatically).
/*
- * If we could not cancel the current sleep due to ERROR, do that before
- * we detach from the shared memory the condition variable is located in.
- * If we did not, the bgworker ERROR handling code would try and fail
- * badly.
+ * Now detach from our shared memory segment. In error cases there might
+ * still be messages from the worker in the queue, which ProcessInterrupts
+ * would try to read; this is pointless (and causes an assertion failure),
+ * so set the global pointer to NULL to have ProcessRepackMessages ignore
+ * them.
*/
- ConditionVariableCancelSleep();
-
- dsm_detach(decoding_worker->seg);
+ dsmseg = decoding_worker->seg;
pfree(decoding_worker);
decoding_worker = NULL;
+
+ /* We must also cancel the current sleep, if one is still set up */
+ ConditionVariableCancelSleep();
+
+ if (dsmseg != NULL)
+ dsm_detach(dsmseg);
I suppose the reason for the assertion failure was reading from the queue
after the backend had detached from it? Thanks for fixing that.
--
Antonin Houska
Web: https://www.cybertec-postgresql.com
reply
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Reply to all the recipients using the --to and --cc options:
reply via email
To: [email protected]
Cc: [email protected], [email protected], [email protected], [email protected]
Subject: Re: [PATCH] Fix REPACK decoding worker not cleaned up on FATAL exit
In-Reply-To: <8656.1779263646@localhost>
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox