Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1v33iT-00AeeJ-Aq for pgsql-hackers@arkaria.postgresql.org; Mon, 29 Sep 2025 02:34:17 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.94.2) (envelope-from ) id 1v33iR-00GNRg-7y for pgsql-hackers@arkaria.postgresql.org; Mon, 29 Sep 2025 02:34:15 +0000 Received: from makus.postgresql.org ([2001:4800:3e1:1::229]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1v33iQ-00GNRY-LZ for pgsql-hackers@lists.postgresql.org; Mon, 29 Sep 2025 02:34:15 +0000 Received: from mail-pg1-x529.google.com ([2607:f8b0:4864:20::529]) by makus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.96) (envelope-from ) id 1v33iO-000Qrg-19 for pgsql-hackers@postgresql.org; Mon, 29 Sep 2025 02:34:14 +0000 Received: by mail-pg1-x529.google.com with SMTP id 41be03b00d2f7-b593def09e3so1259423a12.2 for ; Sun, 28 Sep 2025 19:34:12 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1759113252; x=1759718052; darn=postgresql.org; h=references:to:cc:in-reply-to:date:subject:mime-version:message-id :from:from:to:cc:subject:date:message-id:reply-to; bh=s7rdM89xuRhtVzIYYcbAJLi9hQiPQi1jbpUQ+IgtfF4=; b=PlewjHecra5Z86a3GOhSW400/cKp9LQENizGVQdhaZaW3a25KqT80RCIpMoYKCkazy KLAjXGZaxoLAbcxjbyPZ39Hu8NimWgdbNP0ZvRjsQ19/A8oPuAVePjWenvnvu7aEzf80 thYsXRvQyLXlSaPkVssxG9rw60PttB6fzOZHLq3YTsvIYdDj8pcv3vJEjnIqHMwLGgWn cB0YSQ14JTYuiewAuEAeeO49OnGhHqEPlpKRX2zYxEzk/y6uinSwc1hujDkjX4IZ7i2P YxzZu/lePzMBL/RiIrFSKUoWXP8g143IlytoU/gcUdyq0YelVXOmVIRDf82H0ch7y2mq cjHQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1759113252; x=1759718052; h=references:to:cc:in-reply-to:date:subject:mime-version:message-id :from:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=s7rdM89xuRhtVzIYYcbAJLi9hQiPQi1jbpUQ+IgtfF4=; b=Jp8XBtZj59uHzCRWK6JSSUMKmfixYMGo8EB5fWwwaCr1kWLo9jf/+pABoaLtmPnsjl Wr04nZdrFrNNWsI5DjWt/UgKtqFIgWrZo/ZDvSBokc1QW77y+oqa2QrZOGsNhtWaG0+p dho3KZ0/2zJThzwYe+vklLeL4xqifRb21pv+O6IKAl+JOikyaWKtxRM3rv+felIu9tZl PlaLqPOqV/572flyHOcIdUT6b3/MwvmA4AC+jrVCVFYNHSeOG3LMDA184LTSgeMbvwzI Z9UDTAw3lWOCJ/yDChVQKpZIRX24iOzeTF7Un4t5vWmUjRPlTBSPucko+ioU3pKUp+nm QirA== X-Forwarded-Encrypted: i=1; AJvYcCU4P3NcwCTwDzgQ+efSr2e0fjOP1GRaTclSGonPJ7zFxR0PBp1NRQPcy4GvoxJqKMFn2EwQ5FuftK6yri40@postgresql.org X-Gm-Message-State: AOJu0YxUKD4HNchRxh5cSsfg5Ytw9vO66RP9jk1KvNbCfp033n+PmlyP IplNYFJfDufvsW2p+NyO3KZkPWyzv5YTzdayRozjERR8ZJjOca5evN5d X-Gm-Gg: ASbGncuYFpDM/RGSr7l5ESxHEd3masUtFTllZRl4iR2nCdDdreXOT9pr+29Xyo3aMcB hTz4QWEmPKHP2q9qYTlOpk/Tn/UT8mOddob9g3WxJyHePLsx1e9hqwN4H0vPVE/GefJyIbMGlw6 U4HRJA0aFIviaI7pnL2wVBtJ4HfG6KeDSBRNDZ9tydi4NQ04zquQCGoIKw+f8pvBiKy/px1Tnb9 eeLx/FC8Goc36UiOcUSH6yVhiYxsTN49ih+vZHjnD6TBtV5/iO1wQ0Glp3BCttaMYtW+yh7kK1S X7KUWEgyyc/BivFz2S6Y7oVx6G8/6rCQOq4h6YknN6vG9CBw8lJx4hlwHAk2Jfns/ZD2Dr0so0e 5BzFavmQvuI7hc4r281hdnpCF9vtJ8w6w X-Google-Smtp-Source: AGHT+IFKhW8e4lmf7uPOy4QNREFNV0RuQ1ydbBXqMuTvqVmV5GoiqWkSAmC/IYPKW6T+Z9/lNZQSwg== X-Received: by 2002:a17:903:8d0:b0:269:8d65:40bd with SMTP id d9443c01a7336-27ed4a425ddmr167426755ad.40.1759113251936; Sun, 28 Sep 2025 19:34:11 -0700 (PDT) Received: from smtpclient.apple ([142.171.105.12]) by smtp.gmail.com with ESMTPSA id d9443c01a7336-27ed66d2ffdsm115801885ad.18.2025.09.28.19.34.08 (version=TLS1_2 cipher=ECDHE-ECDSA-AES128-GCM-SHA256 bits=128/128); Sun, 28 Sep 2025 19:34:11 -0700 (PDT) From: Chao Li Message-Id: Content-Type: multipart/alternative; boundary="Apple-Mail=_B07F0BE7-6F47-415A-92E9-75F18058B68F" Mime-Version: 1.0 (Mac OS X Mail 16.0 \(3826.700.81\)) Subject: Re: Optimize LISTEN/NOTIFY Date: Mon, 29 Sep 2025 10:33:54 +0800 In-Reply-To: Cc: Tom Lane , Thomas Munro , pgsql-hackers , Heikki Linnakangas , Rishu Bagga To: Joel Jacobson References: <6899c044-4a82-49be-8117-e6f669765f7e@app.fastmail.com> <165530.1752362320@sss.pgh.pa.us> <02a7cd37-e2fc-4212-8b19-f8c239c95fb8@app.fastmail.com> <96f00bf1-cc9d-4520-9d02-9e14e7767c88@app.fastmail.com> <30c2aa7d-dd6c-4b68-a2e4-f217a1a34acf@app.fastmail.com> <0b4d402a-9ac2-4aa8-acf8-8231dbe579ea@app.fastmail.com> <3095599.1758644879@sss.pgh.pa.us> <0dc6a2cc-5216-4dc1-9dd2-430cafc6095b@app.fastmail.com> <52CC167F-763B-4ECA-B0B4-DAB381816828@gmail.com> <9186C6D0-F7A9-482A-9183-89E530B57E36@gmail.com> X-Mailer: Apple Mail (2.3826.700.81) List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk --Apple-Mail=_B07F0BE7-6F47-415A-92E9-75F18058B68F Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset=utf-8 > On Sep 28, 2025, at 18:24, Joel Jacobson wrote: >=20 >>=20 >> I might miss the factor of holding an exclusive lock. I will revisit=20= >> that part again. >=20 > I've re-read this entire thread, and I actually think my original > approaches are more promising, that is, the > 0001-optimize_listen_notify-v4.patch patch, doing multicast targeted > signaling. >=20 > Therefore, merely consider the latest patch as PoC with some possible > interesting ideas. >=20 > Before this patch, I had never used PostgreSQL's timeout mechanism > before, so I didn't consider it when thinking about how to solve the > remaining problems with 0001-optimize_listen_notify-v4.patch, which > currently can't guarantee that all listening backends will eventually > catch up, since it just kicks one of the most lagging ones, for each > notification. This could be a problem in practise if there is a long > period of time with no notifications coming in. Then some listening > backends could end up not being signaled and would stay behind, > preventing the queue tail from advancing. >=20 > I'm thinking maybe somehow we can use the timeout mechanism here, but > I'm not sure how yet. Any ideas? >=20 > /Joel Hi Joel, I never had a concern about using the timeout mechanism. My comment was = about enabling timeout duplicately. I just revisited the code, now I agree that I was over-worried because I = missed considering NotifyQueueLock. With the lock protection, a backend = process=E2=80=99 QUEUE_BACKEND_WAKEUP_PENDING_FLAG won=E2=80=99t have = race condition, then it should have no duplicate signals sending to the = same backend process. Then in the backend process, you have = =E2=80=9Clast_wakeup_start_time=E2=80=9D that avoids duplicate timeout = within a configured period, and you reset last_wakeup_start_time in = asyncQueueReadAllNotifications() together with cleaning the = QUEUE_BACKEND_WAKEUP_PENDING_FLAG. So, overall v2 looks good to me. One last tiny comment is about naming of last_wakeup_start_time. I think = it can be renamed to =E2=80=9Clast_wakeup_time=E2=80=9D. Because the = variable just records when asyncQueueReadAllNotifications() last time = called, there seems not a meaning of =E2=80=9Cstart=E2=80=9D involved. Best regards, -- Chao Li (Evan) HighGo Software Co., Ltd. https://www.highgo.com/ --Apple-Mail=_B07F0BE7-6F47-415A-92E9-75F18058B68F Content-Transfer-Encoding: quoted-printable Content-Type: text/html; charset=utf-8
On Sep 28, = 2025, at 18:24, Joel Jacobson <joel@compiler.org> wrote:


I might miss the factor of holding an = exclusive lock. I will revisit 
that part = again.

I've = re-read this entire thread, and I actually think my original
approaches are more promising, that is, = the
0001-optimize_listen_notify-v4.patch patch, doing multicast = targeted
signaling.

Therefore, merely consider the latest patch = as PoC with some possible
interesting ideas.

Before = this patch, I had never used PostgreSQL's timeout mechanism
before, so I didn't consider it when = thinking about how to solve the
remaining problems with = 0001-optimize_listen_notify-v4.patch, which
currently can't guarantee that all = listening backends will eventually
catch = up, since it just kicks one of the most lagging ones, for each
notification. This could be a problem in = practise if there is a long
period = of time with no notifications coming in. Then some listening
backends could end up not being signaled = and would stay behind,
preventing the queue tail from advancing.

I'm thinking maybe somehow we can use the = timeout mechanism here, but
I'm not = sure how yet. Any ideas?

/Joel

Hi = Joel,

I never had a concern about using the = timeout mechanism. My comment was about enabling timeout = duplicately.

I just revisited the code, now I = agree that I was over-worried because I missed considering = NotifyQueueLock. With the lock protection, a backend process=E2=80=99 = QUEUE_BACKEND_WAKEUP_PENDING_FLAG won=E2=80=99t have race condition, = then it should have no duplicate signals sending to the same backend = process. Then in the backend process, you have = =E2=80=9Clast_wakeup_start_time=E2=80=9D that avoids duplicate timeout = within a configured period, and you reset last_wakeup_start_time in = asyncQueueReadAllNotifications() together with cleaning the = QUEUE_BACKEND_WAKEUP_PENDING_FLAG.

So, overall = v2 looks good to me.

One last tiny comment is = about naming of last_wakeup_start_time. I think it can be renamed to = =E2=80=9Clast_wakeup_time=E2=80=9D. Because the variable just records = when asyncQueueReadAllNotifications() last time called, there seems not = a meaning of =E2=80=9Cstart=E2=80=9D involved.

Best = regards,
--
Chao Li (Evan)
HighGo Software = Co., Ltd.
https://www.highgo.com/




= --Apple-Mail=_B07F0BE7-6F47-415A-92E9-75F18058B68F--