Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1wPbkr-000tFP-00 for pgsql-hackers@arkaria.postgresql.org; Wed, 20 May 2026 07:54:13 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.96) (envelope-from ) id 1wPbko-006iRX-2J for pgsql-hackers@arkaria.postgresql.org; Wed, 20 May 2026 07:54:11 +0000 Received: from makus.postgresql.org ([2001:4800:3e1:1::229]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1wPbko-006iRP-1H for pgsql-hackers@lists.postgresql.org; Wed, 20 May 2026 07:54:11 +0000 Received: from mail-wm1-x32a.google.com ([2a00:1450:4864:20::32a]) by makus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.98.2) (envelope-from ) id 1wPbkm-00000000SIx-1cQz for pgsql-hackers@lists.postgresql.org; Wed, 20 May 2026 07:54:10 +0000 Received: by mail-wm1-x32a.google.com with SMTP id 5b1f17b1804b1-48e6db3ff7eso22863585e9.0 for ; Wed, 20 May 2026 00:54:08 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=cybertec.at; s=google; t=1779263647; x=1779868447; darn=lists.postgresql.org; h=message-id:date:content-id:mime-version:comments:references :in-reply-to:subject:cc:to:from:from:to:cc:subject:date:message-id :reply-to; bh=9zjvBQ+tEFaKeDcOj7ce0qTCpevdMlveMVkUazh4U9Y=; b=n+yI8VDLhdAczcBEYcOUXCjygzJsoXpQmoJKzLjBiJJJwRK9stcI0oZHyXjEZrnR+Q hUKkvvlyR05OEGoB4OqY2XBlhTg1KwL1AW+g6X1z743mPoHLtavUwb/RNcl4T+U1iKh+ TKxlYAtDGGJxw9c53NjxDzXzKn+fJAIw06x7qRqqkb9XJFFuEDaNNUV801OI/HQDy211 vULIJgXEvwM3IJJA+ZsmjBnY4MHXbn00Yh8T8WWzmHaDxzVVe6WPu0jJjHhuCKjp5jaX CuSCiZDoPIdCBWpD3wV882n2Z3nLR0hEUYW8NwcsRPvk9kjAf0lb2ZWzz50+45CKMhUC t8IQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20251104; t=1779263647; x=1779868447; h=message-id:date:content-id:mime-version:comments:references :in-reply-to:subject:cc:to:from:x-gm-gg:x-gm-message-state:from:to :cc:subject:date:message-id:reply-to; bh=9zjvBQ+tEFaKeDcOj7ce0qTCpevdMlveMVkUazh4U9Y=; b=GEZlYZrG8knxRVLrDfSHg1UwiPmXGpilGnIXX5D40WNehwGWVmW+ZvmpnM9KAiVOqz X3P/HjiSlKXp+JqcSfjKVAnGuzzxjhsh/GbNtWYKTGdl7DDY0LHPMxNvfZ/AkEkPqliJ xLUrT+A3sPdJhZUcb4HPhLjfXYgCMNQs5ik14R+wal1hPOjN8bZk8+1z5gVkkt+xyXLb LJVSQaa5+H0S1C7/Qgm4ha+PetnMsyQyVmblBkpbdcxI9OhqvjmhSNocf/fjdPsDdNk2 0yrNiZW9cw+FblGhiQ8y8cBws89iGdMeOYid04tQS0ZO4YqwLq7svbCk3nNuPt8yNd/G wm3w== X-Forwarded-Encrypted: i=1; AFNElJ/y0Q+oxf9p/lxF3Ex1LDDmEYhjVq6DkOMnBXRlZ49LzCvME3us9mhJy3bwRdneYeBRo1FAP5E5EG/CW0ak@lists.postgresql.org X-Gm-Message-State: AOJu0YwIJ17QLSE+METpDWW8sBltCqWvYq06Ks6JJuYM0MWa8qsnOtW2 ERE+uHu8/Ah0sCzs6droxy5QIwITQrtao4EIDb1sJug5Fdo9Y1+alZxEENkI2y7IEEI= X-Gm-Gg: Acq92OHqJ75V3cXpGdntBNQEnHfpsd6GNRlgNqt79HGJo984HG9kw08ZXCEOmZ0T5zj FA8BMgpLIn5rZB54htibRgp8WGdN0kX8YnJvilHKZb0nzZsEdYJ4i8Hx08sxh+xYFFkutn0DZT/ GIe+cxqFB+bvX0fs/v3jGEpfMXas7PXhW+vccLNxVC5ThQhOr6YA230FWm5d0nM1iNdfDZ9A7Vm m94lnEKX2KdrmHPp65tat5OIJ/MwS+HLkrWcfg4flyNo6slyzYJbhJYZ6ZwkC3ECKlmcCSNZO35 XWowpJd7AmHVJZGJ1j/uWuxKIkAlPXdVtId9ILD11WjReq4mwtJv0pb8qFalgeOANI4wmIJqNkn L8vPG3UVe3d1qYasVwXe7554wLIhepNrynYaN9uwcYU5d+0ZYGMnLKANz02qXHZRuK2wORGeHmv 9tOtSBRv2N+WL8Hw1VmBG8fpMfvP+VJmOX+cGy X-Received: by 2002:a05:600c:a406:b0:489:1aed:1658 with SMTP id 5b1f17b1804b1-48fe632a9f4mr272410235e9.23.1779263647492; Wed, 20 May 2026 00:54:07 -0700 (PDT) Received: from localhost (109-81-168-142.rct.o2.cz. [109.81.168.142]) by smtp.gmail.com with ESMTPSA id 5b1f17b1804b1-4900c16c62dsm201711685e9.11.2026.05.20.00.54.06 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 20 May 2026 00:54:07 -0700 (PDT) From: Antonin Houska To: Alvaro Herrera cc: Baji Shaik , pgsql-hackers@lists.postgresql.org Subject: Re: [PATCH] Fix REPACK decoding worker not cleaned up on FATAL exit In-reply-to: References: Comments: In-reply-to Alvaro Herrera message dated "Tue, 19 May 2026 11:45:05 -0700." X-Mailer: MH-E 8.6+git; nmh 1.8; GNU Emacs 28.3 MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-ID: <8655.1779263646.1@localhost> Date: Wed, 20 May 2026 09:54:06 +0200 Message-ID: <8656.1779263646@localhost> List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk Alvaro Herrera wrote: > On 2026-May-17, Baji Shaik wrote: > > > v3 uses PG_ENSURE_ERROR_CLEANUP which: > > - Handles both ERROR and FATAL exits > > - Automatically cancels the callback on normal completion > > (no slot leak) > > - Runs before memory contexts are destroyed (no use-after-free) > > Yeah, looks good. I have pushed it, with some comment wordsmithing and > other cosmetic changes. > > While looking at it, I realized that I didn't like the way > stop_repack_decoding_worker() works, mainly because if there's no > handle, we leak everything else -- and the way we initialize things > means we leak the shared memory segment. This is maybe a rare case and > just a small memory leak, but it seems better to do it nicely. So > here's a followup patch that reworks that code. This also forced me to > understand more clearly what is going on, so I rewrote the comments. The call of shm_mq_detach() got lost, or do you rely on dsm_detach() to call shm_mq_detach_callback() ? The latter does not free ->mqh_buffer. Since each REPACK runs in a separate transaction, I wouldn't consider that a leak, but I still think that explicit call of shm_mq_detach() makes the code a bit easier to read (i.e. no need for the developer to check if the detaching happens automatically). /* - * If we could not cancel the current sleep due to ERROR, do that before - * we detach from the shared memory the condition variable is located in. - * If we did not, the bgworker ERROR handling code would try and fail - * badly. + * Now detach from our shared memory segment. In error cases there might + * still be messages from the worker in the queue, which ProcessInterrupts + * would try to read; this is pointless (and causes an assertion failure), + * so set the global pointer to NULL to have ProcessRepackMessages ignore + * them. */ - ConditionVariableCancelSleep(); - - dsm_detach(decoding_worker->seg); + dsmseg = decoding_worker->seg; pfree(decoding_worker); decoding_worker = NULL; + + /* We must also cancel the current sleep, if one is still set up */ + ConditionVariableCancelSleep(); + + if (dsmseg != NULL) + dsm_detach(dsmseg); I suppose the reason for the assertion failure was reading from the queue after the backend had detached from it? Thanks for fixing that. -- Antonin Houska Web: https://www.cybertec-postgresql.com