Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1vNCLZ-00Djbs-0l for pgsql-hackers@arkaria.postgresql.org; Sun, 23 Nov 2025 15:49:53 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.96) (envelope-from ) id 1vNCLX-00Ewnl-1l for pgsql-hackers@arkaria.postgresql.org; Sun, 23 Nov 2025 15:49:51 +0000 Received: from makus.postgresql.org ([2001:4800:3e1:1::229]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1vNCLW-00Ewnd-1q for pgsql-hackers@lists.postgresql.org; Sun, 23 Nov 2025 15:49:51 +0000 Received: from fhigh-a6-smtp.messagingengine.com ([103.168.172.157]) by makus.postgresql.org with smtp (Exim 4.96) (envelope-from ) id 1vNCLU-0013sY-0H for pgsql-hackers@postgresql.org; Sun, 23 Nov 2025 15:49:50 +0000 Received: from phl-compute-05.internal (phl-compute-05.internal [10.202.2.45]) by mailfhigh.phl.internal (Postfix) with ESMTP id 77F4B140013E; Sun, 23 Nov 2025 10:49:47 -0500 (EST) Received: from phl-imap-03 ([10.202.2.93]) by phl-compute-05.internal (MEProxy); Sun, 23 Nov 2025 10:49:47 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=compiler.org; h= cc:cc:content-transfer-encoding:content-type:content-type:date :date:from:from:in-reply-to:in-reply-to:message-id:mime-version :references:reply-to:subject:subject:to:to; s=fm3; t=1763912987; x=1763999387; bh=pvcTdFydxPkUSiJ5PPFR04mvpDcdhJICQNLJac9/3mU=; b= Wgo7zieFmTSUEv6+K83c5l+okSlENrl6GqjqGShSsdbf6UWKndL3YWYnCFnTXK6g 4+E4Y8Ft1tuQFfpmlQNmEkYzbYBIJ7Yl1kQX4ttwhYqoT59bPZSPi+4IaSkZhWxr gAPqk4arAfySvAnvKuT9bct1GN/HkNyYBLQgnGTqA6wP+7H0J2xCWq7Ix1hOhpL5 9H0X8EXhKN3gmF4TRYs8krjdf9//qi6rjyTa/jK8fLh8UKOaCev99aMRbCqLPrhT ylB6vTV+El6MmWX1fjsFv/EB95JNzNRn70J6xUx4PcOu5kIZgn3G6Dq8sT4/g69A WnTEoJRCGQI5b1DpbdxtCA== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:cc:content-transfer-encoding :content-type:content-type:date:date:feedback-id:feedback-id :from:from:in-reply-to:in-reply-to:message-id:mime-version :references:reply-to:subject:subject:to:to:x-me-proxy :x-me-sender:x-me-sender:x-sasl-enc; s=fm3; t=1763912987; x= 1763999387; bh=pvcTdFydxPkUSiJ5PPFR04mvpDcdhJICQNLJac9/3mU=; b=d XgjwlCS3vWRLN0n1fogUpia4sYCYfkuBI0s0Qcchq+O0S8kLYTiXTzak8/CbJ71d 0Zc/JTSuvVlzt1S9F/lviZCSYr5oSdXqgrUUSID0GuU/AvHXq0aNWlQqP+566TJS 7f7vVdejeIjBYKJeRYMjkiFyHjdiRAKOKBamkOxgs+1ngznonvEzUk9VbVXmuyKp p9Npzew5+ZlkoPeB+3vwg39PCNiHrinH2ExSRdpHBswxIaDrnnltk35PfyDziGp4 iLUad3u7NSHJoV5rysMJLMeIrYBqdOUGAaaae5N3b6sDE3evewQStOhqQeh/oRP+ RnUtxJvf+LJtvsVFUvYFQ== X-ME-Sender: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgeeffedrtdeggddvfeeiuddvucetufdoteggodetrf dotffvucfrrhhofhhilhgvmecuhfgrshhtofgrihhlpdfurfetoffkrfgpnffqhgenuceu rghilhhouhhtmecufedttdenucenucfjughrpefoggffhffvvefkjghfufgtgfesthejre dtredttdenucfhrhhomhepfdflohgvlhculfgrtghosghsohhnfdcuoehjohgvlhestgho mhhpihhlvghrrdhorhhgqeenucggtffrrghtthgvrhhnpeevieegkefhhefgvefggeduhf ffieffveejgfegjeffffeitdfhveeguefguedttdenucffohhmrghinhepphhoshhtghhr vghsqhhlrdhorhhgnecuvehluhhsthgvrhfuihiivgeptdenucfrrghrrghmpehmrghilh hfrhhomhepjhhovghlsegtohhmphhilhgvrhdrohhrghdpnhgspghrtghpthhtohepfedp mhhouggvpehsmhhtphhouhhtpdhrtghpthhtoheplhhirdgvvhgrnhdrtghhrghosehgmh grihhlrdgtohhmpdhrtghpthhtohepphhgshhqlhdqhhgrtghkvghrshesphhoshhtghhr vghsqhhlrdhorhhgpdhrtghpthhtohepthhglhesshhsshdrphhghhdrphgrrdhush X-ME-Proxy: Feedback-ID: ic6394509:Fastmail Received: by mailuser.phl.internal (Postfix, from userid 501) id 0EC4118E0069; Sun, 23 Nov 2025 10:49:47 -0500 (EST) X-Mailer: MessagingEngine.com Webmail Interface MIME-Version: 1.0 X-ThreadId: AE1r89ybsZ1g Date: Sun, 23 Nov 2025 16:49:26 +0100 From: "Joel Jacobson" To: "Tom Lane" Cc: "Chao Li" , pgsql-hackers Message-Id: <3a919670-037b-4073-be33-745ca69a6232@app.fastmail.com> In-Reply-To: <897cb48e-fa39-4ca6-a29f-ec3bb6ce99b1@app.fastmail.com> References: <6899c044-4a82-49be-8117-e6f669765f7e@app.fastmail.com> <165530.1752362320@sss.pgh.pa.us> <02a7cd37-e2fc-4212-8b19-f8c239c95fb8@app.fastmail.com> <96f00bf1-cc9d-4520-9d02-9e14e7767c88@app.fastmail.com> <30c2aa7d-dd6c-4b68-a2e4-f217a1a34acf@app.fastmail.com> <0b4d402a-9ac2-4aa8-acf8-8231dbe579ea@app.fastmail.com> <3095599.1758644879@sss.pgh.pa.us> <0dc6a2cc-5216-4dc1-9dd2-430cafc6095b@app.fastmail.com> <52CC167F-763B-4ECA-B0B4-DAB381816828@gmail.com> <9186C6D0-F7A9-482A-9183-89E530B57E36@gmail.com> <1073593.1759423179@sss.pgh.pa.us> <4bd5e6c4-6fa7-44bb-869d-59a32a331fa8@app.fastmail.com> <85828f29-e72e-4400-94f3-9a69bc8dc239@app.fastmail.com> <2495353.1759860890@sss.pgh.pa.us> <8aeae418-92a6-4bbd-9c06-9574c79e59f7@app.fastmail.com> <2531672.1759868124@sss.pgh.pa.us> <474efa78-337c-41cd-a73a-f845a0115109@app.fastmail.com> <2749343.1759949176@sss.pgh.pa.us> <8bfca2be-1ec0-4e15-aafb-0b7b661fe936@app.fastmail.com> <9eba307f-f2fb-48f0-9507-2e197f39ef9e@app.fastmail.com> <8c71183a-0d28-4bcf-a806-78446ff95404@app.fastmail.com> <1009807.1760476747@sss.pgh.pa.us> <1F7227F5-C33D-4E2C-8511-33F1468590D0@gmail.com> <0a5a20d3-4621-46b3-b2ab-903f63a20dea@app.fastmail.com> <6F913129-ABEF-4004-AAF3-F22FC3!4!29AE8@gmail.com> <1547585.1760645808@sss.pgh.pa.us> <14865EB6-0BF4-462B-9072-10BDAC10C0> <2067239.1763670416@sss.pgh.pa.us> <897cb48e-fa39-4ca6-a29f-ec3bb6ce99b1@app.fastmail.com> Subject: Re: Optimize LISTEN/NOTIFY Content-Type: text/plain Content-Transfer-Encoding: 7bit List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk On Sat, Nov 22, 2025, at 22:30, Joel Jacobson wrote: > On Thu, Nov 20, 2025, at 21:26, Tom Lane wrote: >> I took a brief look through the v28 patch, and I'm fairly distressed >> at how much new logic has been stuffed into what's effectively a >> critical section. It's totally not okay for AtCommit_Notify or >> anything it calls to throw an error > > Right, I agree. Thanks for guidance. > >> So I think there needs to be a serious effort made to move as >> much as we possibly can of the potentially-risky stuff into >> PreCommit_Notify. In particular I think we ought to create >> the shared channel hash entry then, and even insert our PID >> into it. We could expand the listenersArray entries to include >> both a PID and a boolean "is it REALLY listening?", and then >> during Exec_ListenCommit we'd only be required to find an >> entry we already added and set its boolean, so there's no OOM >> hazard. Possibly do something similar with the local >> listenChannelsHash, so as to remove that possibility of OOM >> failure as well. > > Thanks for the idea, I like this approach. I've expanded the > listenersArray like suggested, and moved all risky stuff from > Exec_ListenCommit to PreCommit_Notify. > >> (An alternative design could be to go ahead and insert our >> PID during PreCommit_Notify, and just tolerate the small >> risk of getting signaled when we didn't need to be. But >> then we'd need some mechanism for cleaning out the bogus >> entry during AtAbort_Notify.) > > We seem to need a cleanup mechanism also with the boolean field design, > since a channel could end up being added only to listenChannelsHash, but > not channelHash, which would trick IsListeningOn() into falsely thinking > we're listening on such channel when we're not. This could happen if > successfully adding the channel to listenChannelsHash, but OOM when > trying to add it to channelHash. > > AtAbort_Notify now handles such half-state, by reconciling all channels > that had LISTEN_LISTEN actions, using the channelHash as the single > source of truth, removing channels from both listenChannelsHash and > channelHash, unless the active field is true (which means we were > already listening to the channel due to a previous transaction). > > I've tested triggering the cleanup logic by adding elog ERROR that > triggered after listenChannelsHash insert, and another test that > triggered after channelHash insert, and it cleaned it up correctly. I > haven't created a test for it in tree though, would we want that? > >> I'm not sure what I think about the new logic in SignalBackends >> from this standpoint. Making it very-low-probability-of-error >> definitely needs some consideration though. > > The initChannelHash call in SignalBackends is now gone, moved to > PreCommit_Notify (called if there are any pendingNotifies). > > I also took the liberty of fixing the XXX comment, to lazily preallocate > the signals arrays during PreCommit_Notify. It felt too inconsistent to > just leave that unfixed, but maybe should be a separate commit? I've extracted the preallocation of signals arrays into a separate patch: https://commitfest.postgresql.org/patch/6248/ > I wonder how risky the remaining new logic in SignalBackends is. For > instance, I looked at dshash_find(..., false), and note it calls > LWLockAcquire which in turn could elog ERROR if num locks is exceeded, > but in master we're already calling LWLockAcquire in SignalBackends, so > should be fine I guess? > > Apart from that, I don't see any new logic in SignalBackends, that could > potentially be risky. /Joel