Date: Wed, 29 Oct 2025 11:33:07 +0100
From: "Joel Jacobson"
To: pgsql-hackers
Message-Id: <7556f0d4-03fd-451a-bd34-5f62b424319a@app.fastmail.com>
Subject: Re: Optimize LISTEN/NOTIFY

On Wed, Oct 29, 2025, at 08:05, Chao Li wrote:
>> On Oct 29, 2025, at 05:45, Joel Jacobson wrote:
>> I found a concurrency bug in v21 that could cause missed wakeup when a
>> backend would UNLISTEN on the last channel, which called
>> asyncQueueUnregister, and if wakeupPending was at that time already set,
>> then it wouldn't get reset, since in ProcessIncomingNotify we return
>> early if (listenChannels == NIL), so we would never clear wakeupPending
>> which happens in asyncQueueReadAllNotifications.
>>
>> Fixed by clearing wakeupPending in asyncQueueUnregister:
>>
>> @@ -1597,6 +1597,7 @@ asyncQueueUnregister(void)
>>  	/* Mark our entry as invalid */
>>  	QUEUE_BACKEND_PID(MyProcNumber) = InvalidPid;
>>  	QUEUE_BACKEND_DBOID(MyProcNumber) = InvalidOid;
>> +	QUEUE_BACKEND_WAKEUP_PENDING(MyProcNumber) = false;
>>  	/* and remove it from the list */
>>  	if (QUEUE_FIRST_LISTENER == MyProcNumber)
>>  		QUEUE_FIRST_LISTENER = QUEUE_NEXT_LISTENER(MyProcNumber);
>>
>> /Joel<0001-optimize_listen_notify-v22.patch><0002-optimize_listen_notify-v22.patch>
>
> I think the current implementation still has a race problem.
>
> Let's say notifier N1 notifies listener L1 to read a message.
> L1 starts to read: it acquires the lock, gets the reading range, then
> releases the lock and starts reading without holding the lock.
> Notifier N2 comes; N2 doesn't have anything L1 is interested in. N2 now
> holds the lock, and when it checks "if (QUEUE_POS_EQUAL(pos,
> queueHeadBeforeWrite))", here comes the race. Because the lock is in
> N2's hand, L1 cannot get the lock to update its pos, so "if
> (QUEUE_POS_EQUAL(pos, queueHeadBeforeWrite))" will not be satisfied, so
> direct advancement won't happen.

I'm not sure I agree that this qualifies as a race "problem" per se. It
sounds like a case where we would do an unnecessary wakeup, right?

Without more sophisticated data structures (e.g. skip ranges) and
increased code complexity, there will always be cases where we do
unnecessary wakeups. IMO, completely avoiding them need not be a design
goal until we have benchmark data that indicates otherwise.

I think we should iterate by first trying to reason about the
correctness of the code, trying to prove or disprove whether a
notification could ever end up not being delivered.

The bug I fixed in v22 is an example of such a case: it would cause a
listening backend to never be awakened, since notifiers would not signal
it due to the pending wakeup that was never cleared.

I wonder if there could be more such serious bugs in the current code.
I will now focus my efforts on trying to answer that question.

It would be really nice if we could find a way to reason formally about
this. I've been looking into the P programming language, which seems
suitable for modeling and verifying this kind of asynchronous
concurrency protocol. I will give it a try.

/Joel