Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1v6fCF-007vCE-E2 for pgsql-hackers@arkaria.postgresql.org; Thu, 09 Oct 2025 01:11:55 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.94.2) (envelope-from ) id 1v6fCC-00E5pP-PX for pgsql-hackers@arkaria.postgresql.org; Thu, 09 Oct 2025 01:11:53 +0000 Received: from magus.postgresql.org ([2a02:c0:301:0:ffff::29]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1v6fCC-00E5pD-Da for pgsql-hackers@lists.postgresql.org; Thu, 09 Oct 2025 01:11:53 +0000 Received: from mail-pg1-x52a.google.com ([2607:f8b0:4864:20::52a]) by magus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.96) (envelope-from ) id 1v6fCA-0018Kt-2o for pgsql-hackers@postgresql.org; Thu, 09 Oct 2025 01:11:53 +0000 Received: by mail-pg1-x52a.google.com with SMTP id 41be03b00d2f7-b57bf560703so243148a12.2 for ; Wed, 08 Oct 2025 18:11:50 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1759972308; x=1760577108; darn=postgresql.org; h=references:to:cc:in-reply-to:date:subject:mime-version:message-id :from:from:to:cc:subject:date:message-id:reply-to; bh=JDCliNR0DZ/6VziJFmQUVSe94e09HkSWJq78k3k7n8w=; b=H66cactDCfyvEl9ocj7kbKoqh0ufQic7TsxTlt774aFdWVMlv6uWOhmIAa+XCSAuJN kT0oJddAYRrphEZnLgshYXtdiSuBnTLGNHoi1xn0OipABGaCvMGKDrIKkhUiwGWFe8Py lwgnUlTHSXo9E4LCCYch7yyVdmK7Gmg6Qloox8yZdKLG3xo+k9c/2mg8wnW6Rrh+PjRS vN2a5RUWeKZp0C/EdV09AwKDwq8IUhHCVEx+gxzQtwMqmQVXccR546dFb8xcVMRECdSp jzjxtHgu72EfzCCUUSc7/nSOu5ND1nTzUufBetF15zQOfafuPJGtlYX0e5qv+Hrep6yA 8gAQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1759972308; x=1760577108; h=references:to:cc:in-reply-to:date:subject:mime-version:message-id :from:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=JDCliNR0DZ/6VziJFmQUVSe94e09HkSWJq78k3k7n8w=; b=bc1spI63r9itQ3vmCS7QMTuvracmLBmGY1drJ5KIiHn9RCXl8IuVx5gGaVYgMNf3p1 DUy+0E2h3m6CyjtSCzOjBdLh2xfo2n40VtKUdE7a+SGyHS2txwSb0VeRGoVkuD/D2gq2 8SOwGKlwjJk3I66oCb89fl31E5PkfEH4xNPsQVrczx8UKn8I0bUuIh2hbeKNTjbCXTgX tAQN+9+Dmk+QZh1fxMF+yO9i+9kLz9YfaMPHSxR9dgOoUyMwMrHNJcV9hRdvOVzXXHMY 61mtVFTbqJ39CAMURVhrNbzCG4UE/ep8eE9oZl3bGPiJUKoGWIQd9chHDVptqVw6mt4C Fuug== X-Forwarded-Encrypted: i=1; AJvYcCWhQJJNms7tEG6jtoTkNIC/B3HicNuSqaSk9IpTI7EWmogsBjabXgzuzTg8exQhVDoX2+PCRs9zrs+dFjaK@postgresql.org X-Gm-Message-State: AOJu0Yy6zrXIciQ/BGGs1sAC17D0/lG+ISL0VjfpiSNfaqZOZWjqZu4n 4M05RZJG88wITb7NUVPzTD18SQgSfNAv3sOmDepAJiGhFl7xY4li6hMF X-Gm-Gg: ASbGnctObWeyF+4VrRzyqUQ9MrOmGTOlr3lqee6PDCbzrCrD3Ig+2PlPDHtucpHBpHS hJzE92FxcasxMuz0yt6rVIozZVzZraGXLpnicRz1+Qv1nHcCHtykpUrDf+duRnaeoKmtIiqW9Js qRYj6xDDadaJk1xpgTFF/dMIK5PhbqtcOwhEO5J52F/8iaJLnPPJnSmsxCd2872MFgbn1IEZc0T ywpPiugfjxJZlmjZtsdcw65CM1s+zgn69WID2QwO6AoQ3Axm4MuIf5T1VKfREkrHz6XDSHhoHte bT9JB0fbR706s66dTjoC8tMst0a1qNwdre6XXM2iZ3YMohgOsyy3FYas+GXVnOqpdR2Qr1wJCek 7xe0gh++e6ff42RoQO9Wf6lXDhaXsbvtAnwi8mjzkKOCpoN7VwlkPYLDOcg== X-Google-Smtp-Source: AGHT+IHD/+z8mjhfQdSi+oRENOx78TBMui8ex9KITLR++61vY9Gs1r/n9OWHPQsuqHFgUdxz1AiEFw== X-Received: by 2002:a17:902:ef09:b0:27d:6f37:7b66 with SMTP id d9443c01a7336-290273ffef7mr73362605ad.47.1759972308121; Wed, 08 Oct 2025 18:11:48 -0700 (PDT) Received: from smtpclient.apple ([170.178.170.211]) by smtp.gmail.com with ESMTPSA id d9443c01a7336-29034f362fasm9956975ad.97.2025.10.08.18.11.45 (version=TLS1_2 cipher=ECDHE-ECDSA-AES128-GCM-SHA256 bits=128/128); Wed, 08 Oct 2025 18:11:47 -0700 (PDT) From: Chao Li Message-Id: Content-Type: multipart/alternative; boundary="Apple-Mail=_26BAA13C-A12D-42A7-AF6A-9B19C26FDB54" Mime-Version: 1.0 (Mac OS X Mail 16.0 \(3826.700.81\)) Subject: Re: Optimize LISTEN/NOTIFY Date: Thu, 9 Oct 2025 09:11:32 +0800 In-Reply-To: <2790345f-03c3-4cae-8f14-886ed9079319@app.fastmail.com> Cc: Tom Lane , Matheus Alcantara , pgsql-hackers To: Joel Jacobson References: <6899c044-4a82-49be-8117-e6f669765f7e@app.fastmail.com> <165530.1752362320@sss.pgh.pa.us> <02a7cd37-e2fc-4212-8b19-f8c239c95fb8@app.fastmail.com> <96f00bf1-cc9d-4520-9d02-9e14e7767c88@app.fastmail.com> <30c2aa7d-dd6c-4b68-a2e4-f217a1a34acf@app.fastmail.com> <0b4d402a-9ac2-4aa8-acf8-8231dbe579ea@app.fastmail.com> <3095599.1758644879@sss.pgh.pa.us> <0dc6a2cc-5216-4dc1-9dd2-430cafc6095b@app.fastmail.com> <52CC167F-763B-4ECA-B0B4-DAB381816828@gmail.com> <9186C6D0-F7A9-482A-9183-89E530B57E36@gmail.com> <1073593.1759423179@sss.pgh.pa.us> <4bd5e6c4-6fa7-44bb-869d-59a32a331fa8@app.fastmail.com> <85828f29-e72e-4400-94f3-9a69bc8dc239@app.fastmail.com> <2495353.1759860890@sss.pgh.pa.us> <8aeae418-92a6-4bbd-9c06-9574c79e59f7@app.fastmail.com> <2790345f-03c3-4cae-8f14-886ed9079319@app.fastmail.com> X-Mailer: Apple Mail (2.3826.700.81) List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk --Apple-Mail=_26BAA13C-A12D-42A7-AF6A-9B19C26FDB54 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset=utf-8 > On Oct 8, 2025, at 22:53, Joel Jacobson wrote: >=20 >> 1 >> ``` >> + channels =3D GetPendingNotifyChannels(); >> + >> LWLockAcquire(NotifyQueueLock, LW_EXCLUSIVE); >> - for (ProcNumber i =3D QUEUE_FIRST_LISTENER; i !=3D = INVALID_PROC_NUMBER; i=20 >> =3D QUEUE_NEXT_LISTENER(i)) >> + foreach(lc, channels) >> ``` >>=20 >> I don=E2=80=99t see where =E2=80=9Cchannels=E2=80=9D is freed. = GetPendingNotifyChannels()=20 >> creates a list of Nodes, both the list and Nodes the list points to=20= >> should be freed. >=20 > Per suggestion from Tom Lane I reverted back = GetPendingNotifyChannels(), > so this comment is not applicable any longer. I think you just reverted the usage of list_member() and makeNode(), but = returned =E2=80=9Cchannels=E2=80=9D is still built by =E2=80=9Clappend()=E2= =80=9D that allocates memory for the List structure. So you need to use = =E2=80=9Clist_free(channels)=E2=80=9D to free the memory. >> 5 >> ``` >> /* >> @@ -1865,6 +2087,7 @@ asyncQueueReadAllNotifications(void) >> LWLockAcquire(NotifyQueueLock, LW_SHARED); >> /* Assert checks that we have a valid state entry */ >> Assert(MyProcPid =3D=3D QUEUE_BACKEND_PID(MyProcNumber)); >> + QUEUE_BACKEND_WAKEUP_PENDING(MyProcNumber) =3D false; >> ``` >>=20 >> This piece of code originally only read the shared memory, so it can=20= >> use LW_SHARED lock mode, but now it writes to the shared memory, do = we=20 >> need to change the lock mode to =E2=80=9Cexclusive=E2=80=9D? >=20 > No, LW_SHARED is sufficient here, since the backend only modifies its = own state, > and no other backend could do that, without holding an exclusive lock. Yes, the backend only modifies its own state to =E2=80=9Cfalse=E2=80=9D, = but other backends may set its state to =E2=80=9Ctrue=E2=80=9D, that is = a race condition. So I still think an exclusive lock is needed. >=20 >> 6 >> ``` >> +static inline void >> +ChannelHashPrepareKey(ChannelHashKey *key, Oid dboid, const char = *channel) >> +{ >> + memset(key, 0, sizeof(ChannelHashKey)); >> + key->dboid =3D dboid; >> + strlcpy(key->channel, channel, NAMEDATALEN); >> +} >> ``` >>=20 >> Do we really need the memset()? If =E2=80=9Cchannel=E2=80=9D is of = length NAMEDATALEN,=20 >> then it still results in a non-0 terminated key->channel; if channel = is=20 >> shorter than NAMEDATALEN, strlcpy will auto add a tailing =E2=80=98\0=E2= =80=99. I think=20 >> previous code should have ensured length of channel should be less = than=20 >> NAMEDATALEN. >=20 > Yes, I think we need memset, since I fear that when the hash table = keys > are compared, every byte of the struct might be inspected, so without > zero-initializing it, there could be unused bytes after the null > terminator, that could then cause logically identical keys to be = wrongly > considered different. >=20 > I haven't checked the implementation though, but my gut feeling says > it's better to be a bit paranoid here. The hash function channel_hash_func() is defined by your own code, it = use strnlen() to get length of channel name, so that bytes after = =E2=80=98\0=E2=80=99 won=E2=80=99t be used.=20 And I guess you missed comment 9: 9 ``` + int allocated_listeners; /* Allocated = size of array */ ``` For =E2=80=9Csize=E2=80=9D here, I guess you meant =E2=80=9Clength=E2=80=9D= , though =E2=80=9Csize=E2=80=9D also works, but usually =E2=80=9Csize=E2=80= =9D means bytes occupied by an array and =E2=80=9Clength=E2=80=9D means = number of elements of an array. So, =E2=80=9Clength=E2=80=9D would be = clearer here. And I got a new comment for v12: 10 ``` + found =3D false; + foreach(q, channels) + { + char *existing =3D (char *) lfirst(q); + + if (strcmp(existing, channel) =3D=3D 0) + { ``` Might be safer to do =E2=80=9Cstrncmp(existing, channel, = NAMEDATALEN)=E2=80=9D. Best regards, -- Chao Li (Evan) HighGo Software Co., Ltd. https://www.highgo.com/ --Apple-Mail=_26BAA13C-A12D-42A7-AF6A-9B19C26FDB54 Content-Transfer-Encoding: quoted-printable Content-Type: text/html; charset=utf-8

On Oct 8, 2025, at 22:53, Joel Jacobson = <joel@compiler.org> wrote:

1
```
+ = channels =3D = GetPendingNotifyChannels();
+
LWLockAcquire(NotifyQueueLock, = LW_EXCLUSIVE);
- for (ProcNumber i =3D QUEUE_FIRST_LISTENER; i !=3D = INVALID_PROC_NUMBER; i 
=3D QUEUE_NEXT_LISTENER(i))
+ = foreach(lc, channels)
```

I don=E2=80=99t see where = =E2=80=9Cchannels=E2=80=9D is freed. = GetPendingNotifyChannels() 
creates a list of Nodes, both the = list and Nodes the list points to 
should be = freed.

Per suggestion from Tom Lane I reverted back = GetPendingNotifyChannels(),
so this comment is not applicable any = longer.

I think you just reverted the = usage of list_member() and makeNode(), but returned =E2=80=9Cchannels=E2=80= =9D is still built by =E2=80=9Clappend()=E2=80=9D that allocates memory = for the List structure. So you need to use =E2=80=9Clist_free(channels)=E2= =80=9D to free the memory.

5
```
/*
@@ -1865,6 +2087,7 @@ = asyncQueueReadAllNotifications(void)
LWLockAcquire(NotifyQueueLock, = LW_SHARED);
/* Assert checks that we have a valid state entry = */
Assert(MyProcPid =3D=3D QUEUE_BACKEND_PID(MyProcNumber));
+ = QUEUE_BACKEND_WAKEUP_PENDING(MyProcNumber) =3D false;
```

This = piece of code originally only read the shared memory, so it can 
use LW_SHARED lock = mode, but now it writes to the shared memory, do we 
need to change the lock = mode to =E2=80=9Cexclusive=E2=80=9D?

No, LW_SHARED is sufficient here, since the = backend only modifies its own state,
and no = other backend could do that, without holding an exclusive = lock.

Yes, the backend only = modifies its own state to =E2=80=9Cfalse=E2=80=9D, but other backends = may set its state to =E2=80=9Ctrue=E2=80=9D, that is a race condition. = So I still think an exclusive lock is needed.


6
```
+static inline = void
+ChannelHashPrepareKey(ChannelHashKey *key, Oid dboid, const = char *channel)
+{
+ memset(key, 0, sizeof(ChannelHashKey));
+ = key->dboid =3D dboid;
+ strlcpy(key->channel, channel, = NAMEDATALEN);
+}
```

Do we really need the memset()? If = =E2=80=9Cchannel=E2=80=9D is of length NAMEDATALEN, 
then it still results = in a non-0 terminated key->channel; if channel is 
shorter than = NAMEDATALEN, strlcpy will auto add a tailing =E2=80=98\0=E2=80=99. I = think 
previous = code should have ensured length of channel should be less than 
NAMEDATALEN.

Yes, I think we need = memset, since I fear that when the hash table keys
are compared, every byte of the struct = might be inspected, so without
zero-initializing it, there could be unused bytes after the = null
terminator, that could = then cause logically identical keys to be wrongly
considered different.

I haven't checked the implementation = though, but my gut feeling says
it's = better to be a bit paranoid here.

The hash = function channel_hash_func() is defined by your own code, it use = strnlen() to get length of channel name, so that bytes after =E2=80=98\0=E2= =80=99 won=E2=80=99t be used. 

And I guess you = missed comment = 9:

9
```
+ int = allocated_listeners; /* Allocated size of array = */
```

For =E2=80=9Csize=E2=80=9D = here, I guess you meant =E2=80=9Clength=E2=80=9D, though =E2=80=9Csize=E2=80= =9D also works, but usually =E2=80=9Csize=E2=80=9D means bytes occupied = by an array and =E2=80=9Clength=E2=80=9D means number of elements of an = array. So, =E2=80=9Clength=E2=80=9D would be clearer = here.

And I got a new comment for = v12:

10
```
+ = found =3D false;
+ foreach(q, = channels)
+ {
+ = char =   *existing =3D (char *) = lfirst(q);
+
+ if = (strcmp(existing, channel) =3D=3D 0)
+ = {
```

Might be safer to = do =E2=80=9Cstrncmp(existing, channel, = NAMEDATALEN)=E2=80=9D.

Best regards,
--
Chao Li (Evan)
HighGo Software = Co., Ltd.
https://www.highgo.com/




= --Apple-Mail=_26BAA13C-A12D-42A7-AF6A-9B19C26FDB54--