Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1ujphx-00CWEt-CH for pgsql-hackers@arkaria.postgresql.org; Thu, 07 Aug 2025 01:46:17 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.94.2) (envelope-from ) id 1ujphv-002AMY-1v for pgsql-hackers@arkaria.postgresql.org; Thu, 07 Aug 2025 01:46:15 +0000 Received: from magus.postgresql.org ([2a02:c0:301:0:ffff::29]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1ujphu-002AMO-OM for pgsql-hackers@lists.postgresql.org; Thu, 07 Aug 2025 01:46:14 +0000 Received: from mail-ot1-x330.google.com ([2607:f8b0:4864:20::330]) by magus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.96) (envelope-from ) id 1ujphp-001AQy-1U for pgsql-hackers@lists.postgresql.org; Thu, 07 Aug 2025 01:46:12 +0000 Received: by mail-ot1-x330.google.com with SMTP id 46e09a7af769-741b3d65ed9so5424a34.0 for ; Wed, 06 Aug 2025 18:46:08 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1754531167; x=1755135967; darn=lists.postgresql.org; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=3C90daiWr+tHB/FFzqSQoJSOL8TYJnt9uZSks7tdSM0=; b=Pgp+xHJv8TmjekJ9Z0Sp+OX1hH5EX4eSfLT4bpxwzAhMdRVI1mANmNgv7plrrKJsxT AsJQCA5vrNprPln5EGjhaLvtb//K9gATpPePLvtgUGWD1woy4LP+CGZeRvhR2m2X176g NFeQlT5fC/ikbzKWaEmi4ZIaa5LyRY0heL9LKbIP+RMjEW2f7q9HK2piv7Vt5yXyeU2p 3SSLpQII32MRUj6ky4jufH04KZAUcoyI3XK+hfADuA/ajW5O2NeuBug3FtIja/TFl0xb jYbmLG+D2kcsAof3kVJIbKT+kaOtSfA0Nsmxmz1dguZUDw8GKA6gTTZh8+ISKNxwbxGm kiJw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1754531167; x=1755135967; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=3C90daiWr+tHB/FFzqSQoJSOL8TYJnt9uZSks7tdSM0=; b=YpuAYnq54nsuIymKl/XTk81suVgKrZ+MPchS9CBDz+dT+RIpu+wzNQKSRfdI8IqboD w37xEI6JPbVaz5nvGP899kHzigiLgmFzrRFfiYWMxQWBE+HgvHCQC+4XoypCrO3L0v3p H4MgGII5vPitGwjfmeP3hBXGERiFNEI1Bv84WYTHEI7PG1EkmChmcv5S4PytnDVtx8BF u/PqRgInR/7e6I9LINX+3mnvb99ZUMhf7OyA+dqNeF3Yd2IC7rkAFykVksSL+bYklSHq Pqqkh7SWyABHuQePRtidjD8G35r0GBKuL64CerLC5VQK1Xf87lRKVOABxJ5PXiaFB1Ar jQKw== X-Gm-Message-State: AOJu0Ywe6oOJIIaFhS2cHtcJYwQV1ZaczT0T4cZe/NOFNqadaTTPouyD zYpUNxxriz6KsWRtMOGt3FCsL5OQw3GfrQJpUwPsZRMyKcU7eyNkdX1uqvp8ytRLUG443yFQoT5 ZQVAbE7YaYOUG/oT7nAgT+30mN2b3Y8ITn+Xu X-Gm-Gg: ASbGncvuaXXBf2RWGJJXtJrSX2uJyRX3TbS6pulwrDx8y3tGc4mqnlJVGrleaFw+bkR 6Ji8AzvZ3BSZr9zWJjqGwrR2lW63f+JQB4WFsZAo2DUOETco9Wgqt+abjpBh6E8AEzOHUqVCCTJ JhV/A2rp4Gt0/eTLzNiXuY7FjIxu/Of2mZCQqfMTbjJ9/bxNButJwxaPR0UzbAeSyUQmCOK8DMw +rx5neCyPrczDFkBtC0LSQLk/Vw8YUcyyC0vD3L X-Google-Smtp-Source: AGHT+IGZqTObXzrW4MH7Yr8iYTfNPNdQbydJ60bfQzRVAl2rjORf3C48A5t/e7YuIxlOXWmnm7sfOevgLzzdRSYj3UQ= X-Received: by 2002:a05:6830:34a5:b0:741:9383:f7d5 with SMTP id 46e09a7af769-74308e2c643mr1187378a34.1.1754531167149; Wed, 06 Aug 2025 18:46:07 -0700 (PDT) MIME-Version: 1.0 References: In-Reply-To: From: Thomas Munro Date: Thu, 7 Aug 2025 13:45:30 +1200 X-Gm-Features: Ac12FXzBp2WBZz6g96nAM9ot0CypMF879XJaDHhnuaJCCMuIxEdT2KEvOHOch_Y Message-ID: Subject: Re: [PATCH] OAuth: fix performance bug with stuck multiplexer events To: Jacob Champion Cc: PostgreSQL Hackers , Daniel Gustafsson , Peter Eisentraut Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk On Thu, Aug 7, 2025 at 11:55=E2=80=AFAM Jacob Champion wrote: > On Wed, Aug 6, 2025 at 9:13=E2=80=AFAM Jacob Champion > wrote: > > Maybe "drain" would no longer be the > > verb to use there. > > I keep describing this as "combing" the queue when I talk about it in > person, so v3-0001 renames this new operation to comb_multiplexer(). > And the CI (plus the more strenuous TLS tests) confirms that the > callback count is still stable with this weaker guarantee, so I've > gotten rid of the event-counting code. I was about to hit send on an email suggesting "reset_multiplexer()", and an attempt at distilling the explanation to a short paragraph, but your names and explanations are also good so please feel 100% free to ignore these suggestions. "Unlike epoll descriptors, kqueue descriptors only transition from readable to unreadable when kevent() is called and finds nothing, after removing level-triggered conditions that have gone away. We therefore need a dummy kevent() call after operations might have been performed on the monitored sockets or timer_fd. Any event returned is ignored here, but it also remains queued (being level-triggered) and leaves the descriptor readable. This is a no-op for epoll descriptors." Reviewing the timer stuff, it is again tempting to try to use an EVFILT_TIMER directly, but I like your approach: it's nice to be able to mirror the Linux coding, with this minor kink ironed out. FWIW I re-read the kqueue paper's discussion of the goals of making kqueue descriptors themselves monitorable/pollable, and it seems it was mainly intended for hierarchies of kqueues, like your timer_fd, with the specific aim of expressing priorities. It doesn't talk about giving them to code that doesn't know it has a kqueue fd (the client) and never calls kevent() and infers the events instead (libcurl). That said, the fact that the filter function for kqueue fds just checks queue size > 0 without running the filter functions for the queued events does seem like a bit of an abstraction leak from this vantage point. At least it's easy enough to work around in the kqueue-managing middleman code once you understand it. > Now that I'm no longer counting events, I can collapse the changes to > register_socket(). I can't revert those changes entirely, because then > we regress the case where Curl switches a socket from IN to OUT (this > is enforced by the new unit tests). But I'm not sure that the existing > comment adequately explained that fix anyway, and I didn't remember to > call it out in my initial email, so I've split it out into v3-0002. > It's much smaller. Much nicer! Yeah, that all makes sense. > The tests (now in 0005) have been adjusted for the new "combing" > behavior, and I've added a case to ensure that multiple stale events > are swept up by a single call to comb_multiplexer(). This all looks pretty good to me. I like the C TAP test. PostgreSQL needs more of this. s/signalled/signaled/ (=3D US spelling) in a couple of places.