Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1wFXOX-005GwL-1q for pgsql-hackers@arkaria.postgresql.org; Wed, 22 Apr 2026 13:13:34 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.96) (envelope-from ) id 1wFXOV-00Do5j-1A for pgsql-hackers@arkaria.postgresql.org; Wed, 22 Apr 2026 13:13:31 +0000 Received: from magus.postgresql.org ([2a02:c0:301:0:ffff::29]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1wFXOU-00Do5b-2r for pgsql-hackers@lists.postgresql.org; Wed, 22 Apr 2026 13:13:31 +0000 Received: from mail.postgrespro.ru ([93.174.132.70]) by magus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.98.2) (envelope-from ) id 1wFXOP-00000002TaI-08bP for pgsql-hackers@lists.postgresql.org; Wed, 22 Apr 2026 13:13:30 +0000 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=postgrespro.ru; s=mx2023; t=1776863603; bh=luyNOev8ToyGc4mY70WdXqS6tv3/npceIbmDznei+fU=; h=Message-ID:Date:User-Agent:Subject:To:Cc:References:From: In-Reply-To:From; b=q0lC6VGZnVjxWzw9VqrhDxveqQtdpS7zgGm11nQ97P22RBwHAapoLlQOPk2AaGTqF 5KVccyj4jLph6D5tTijo81hKCwB0KuYli2AQJdGaRYfXwVGZiWJ6wuVVqph/Jfvvfl /vl1V5ngPUwdKLKSneDAXmNoh4I1BDN59i6PkOOfI//+stNnY4E5LCbXmgLAQR/wbb T8/+a80agMZ6ogAK6OJKRCftNCYiuLmnRXc144raPZurupRzwgBJJ2fIPi0Ia9wQVs C3QrR2iBt21nUXydyoERK1tMWhXEQV2yxyEwN/eQGDWiczXqgD+0aE8Z+7XqEy3+eC ocvMTK25buO3A== Received: from [10.3.12.129] (broadband-188-255-38-164.ip.moscow.rt.ru [188.255.38.164]) (using TLSv1.3 with cipher TLS_AES_128_GCM_SHA256 (128/128 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (Client did not present a certificate) (Authenticated sender: y.sokolov@postgrespro.ru) by mail.postgrespro.ru (Postfix/465) with ESMTPSA id 57F7E5FEE7; Wed, 22 Apr 2026 16:13:22 +0300 (MSK) Message-ID: Date: Wed, 22 Apr 2026 16:13:21 +0300 MIME-Version: 1.0 User-Agent: Mozilla Thunderbird Subject: Re: PoC: Add condition variable support to WaitEventSetWait() To: Xuneng Zhou Cc: Chao Li , PostgreSQL Hackers References: <63B9B6C7-BB04-4F6E-B3AC-6886DDC1C3FD@gmail.com> <00D3DE2F-10B8-4439-91B4-D1C24A98627A@gmail.com> Content-Language: en-US From: Yura Sokolov In-Reply-To: Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-KSMG-AntiPhishing: NotDetected X-KSMG-AntiSpam-Interceptor-Info: not scanned X-KSMG-AntiSpam-Status: not scanned, disabled by settings X-KSMG-AntiVirus: Kaspersky Secure Mail Gateway, version 3.0.0.9059, bases: 2026/04/22 10:31:00 #28428174 X-KSMG-AntiVirus-Status: NotDetected, skipped X-KSMG-LinksScanning: not scanned, disabled by settings X-KSMG-Message-Action: skipped X-KSMG-Rule-ID: 1 List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk 22.04.2026 05:58, Xuneng Zhou пишет: > Hi Yura, > > On Wed, Apr 22, 2026 at 1:06 AM Yura Sokolov wrote: >> >> 08.04.2026 11:22, Chao Li пишет: >>>> On Apr 8, 2026, at 11:50, Chao Li wrote: >>>>> On Apr 2, 2026, at 15:38, Chao Li wrote: >>>>>> On Mar 31, 2026, at 16:59, Chao Li wrote: >>>>>>> On Mar 31, 2026, at 15:28, Chao Li wrote: >>>>>>> >>>>>>> Hi, >>>>>>> >>>>>>> There is an XXX comment in WalSndWait(): >>>>>>> ``` >>>>>>> * XXX: A desirable future improvement would be to add support for CVs >>>>>>> * into WaitEventSetWait(). >>>>>>> ``` >>>>>>> >>>>>>> I have been exploring a possible approach for that. This patch is a PoC that adds ConditionVariable support to WaitEventSet. This v1 is mainly intended to gather feedback on the design, so I have only done some basic testing so far, such as a normal logical replication workflow. >>>>>>> >>>>>>> I’d like to highlight a few key points about the design: >>>>>>> >>>>>>> 1. In the current WalSndWait(), although it prepares to sleep on a ConditionVariable, it does not actually check whether the CV has been signaled. In this PoC, I kept that same behavior. However, I tried to make the WaitEventSet support for CVs generic, so that if we want to add actual signal checking in the future, that would be possible. >>>>>>> >>>>>>> 2. To keep the design generic, this patch introduces a new wait event type, WL_CONDITION_VARIABLE. A WL_CONDITION_VARIABLE event occupies a position in the event array, similar to latch and socket events. When a CV is signaled, the corresponding WL_CONDITION_VARIABLE event is returned in occurred_events. >>>>>>> >>>>>>> 3. The WaitEventSet APIs AddWaitEventToSet() and ModifyWaitEvent() are extended to support CVs by adding one more parameter “cv" to both APIs. The downside of this approach is that all call sites of these two APIs need to be updated. I also considered adding separate APIs for CVs, such as AddWaitEventToSetForCV() and ModifyWaitEventForCV(), since CVs do not rely on the kernel and it might therefore make sense to decouple them from socket and latch handling. But for v1, I chose the more generic approach. I’d be interested in hearing comments on this part of the design. >>>>>>> >>>>>>> 4. One important point is that this patch extends the WaitEventSet abstraction, not the underlying kernel wait primitives. A ConditionVariable is still a userspace/shared-memory concept, but with this design it can participate in the same waiting framework as sockets and latches. I think that is useful because it allows mixed waits to be handled through one interface. >>>>>>> >>>>>>> Here is the v1 patch. >>>>>>> >>>>>> >>>>>> I just noticed that I missed checking in my last edit when switching to the other branch, so attaching an updated v1. >>>>> >>>>> PFA v2 - fixed a CI failure from contrib/postgres_fdw. >>>> >>>> Fixed a CI test failure and rebased. >>> >>> PFA v4 - try to fix a test failure on windows. >> >> Good day, Chao Li. >> >> ConditionalVariable works through the MyLatch. >> And almost always there is single latch for one backend: >> MyLatch = PGPROC->procLatch; >> >> (Single exception I found is XLogRecoveryCtl->recoveryWakeupLatch.) >> >> And that is where ConditionVariablePrepareToSleep were used in WalSndWait >> but without check: FeBeWaitSet always waits for MyLatch. >> >> Therefore, I don't clear get, how you suggest to simultaneously wait >> WL_LATCH_SET on MyLatch and WL_CONDITION_VARIABLE? >> How you will distinguish which one was fired? >> >> It looks to my, WL_LATCH_SET on MyLatch and WL_CONDITION_VARIABLE had to be >> mutual exclusive. At least unless ConditionVariable internal are changed. >> >> And check after waiting for conditional variable set had to be placed into >> WaitEventSetWaitBlock where all other such checks are. >> >> All written above is just my humble opinion. >> Excuse me if i'm too strict. >> > > I think you're correct that ConditionVariableSignal() wakes the target > process by calling SetLatch(&proc->procLatch). So every CV signal also > sets MyLatch. The patch introduces ConditionVariableSignaled() to > distinguish the two by checking CV wait-list membership (not the latch > state itself), so in principle the two sources are distinguishable. Nope. Patch may say if ConditionVariable was signalled. But it cann't say if MyLatch were set as well without touching ConditionVariable. It is not better than current state in term of code clearness and flawlessness. > In practice, however, the current patch seems not reliably report > WL_CONDITION_VARIABLE. > > Inside WaitEventSetWait(), the latch check runs before the new CV > check. When a CV signal fires, both conditions become true > simultaneously: MyLatch is set AND the proc has been removed from the > CV wait list. With nevents=1 (which is what WalSndWait passes), the > latch event fills the one-slot output buffer and the loop breaks > before reaching the CV check: > /* existing latch check — runs first */ > if (set->latch && set->latch->is_set) > { > ... > returned_events++; > if (returned_events == nevents) > break; /* CV check below is never reached */ > } > > /* new CV check — unreachable when buffer is full */ > if (set->cv) > cv_signaled = ConditionVariableSignaled(set->cv); > > So for the converted caller, WL_CONDITION_VARIABLE is not reliably > observable. In the common case where the latch readiness is returned > first, the one-slot buffer is filled by WL_LATCH_SET and the CV check > is skipped. Bingo! That is another case - opposite side of coin. That is why I say: "At least unless ConditionVariable internals are changed." -- regards Yura Sokolov aka funny-falcon