From: Jelte Fennema-Nio
Date: Sat, 7 Mar 2026 01:01:03 +0100
Subject: Re: Don't use the deprecated and insecure PQcancel in our frontend tools anymore
To: Heikki Linnakangas
Cc: PostgreSQL Hackers, Alvaro Herrera, Jacob Champion
References: <88dfe280-ba29-4943-95b8-63abc9f3f771@iki.fi> <9d7ba3ac-d660-483e-8f68-9096a2464e90@iki.fi>
In-Reply-To: <9d7ba3ac-d660-483e-8f68-9096a2464e90@iki.fi>

On Fri, 6 Mar 2026 at 20:51, Heikki Linnakangas wrote:
> > In theory we could reduce the window for the race, by having all
> > frontend tools use async connections and have the main thread wait for
> > either the self-pipe or a cancel. That way it would be more similar to
> > the previous signal code in behaviour. That's a much bigger lift though,
> > i.e. all PQexec and PQgetResult calls would need to be modified. My
> > proposed change doesn't require changing the callsites at all.
>
> Yeah, it does have that advantage..

I let Claude Code take a stab at a POC for doing the same thread approach, so that I could get a more accurate feeling for how big this lift would be. It's a much bigger change, but the general design is relatively straightforward. It's attached as the nocfbot patch (it's a separate patch, not built on top of any of the others).

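
As a rough illustration of the design quoted above (the main thread waiting on either the query socket or a self-pipe), here is a minimal sketch. It is not the attached nocfbot patch: setup_cancel_pipe, wait_for_result_or_cancel and result_fd are hypothetical names, and result_fd merely stands in for the socket a real tool would get from PQsocket() after sending the query with PQsendQuery().

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

/* Self-pipe: [0] is polled by the main thread, [1] is written by the handler. */
static int cancel_pipe[2] = {-1, -1};

/* SIGINT handler: only async-signal-safe work, i.e. a single write(). */
static void
handle_sigint(int signo)
{
    int     save_errno = errno;
    char    byte = 1;
    ssize_t rc = write(cancel_pipe[1], &byte, 1);

    (void) signo;
    (void) rc;              /* a full pipe already means "cancel pending" */
    errno = save_errno;
}

/* Hypothetical helper: create the pipe and install the handler. 0 on success. */
static int
setup_cancel_pipe(void)
{
    struct sigaction sa;

    if (pipe(cancel_pipe) != 0)
        return -1;
    /* Non-blocking on both ends so neither the handler nor the drain loop blocks. */
    if (fcntl(cancel_pipe[0], F_SETFL, O_NONBLOCK) == -1 ||
        fcntl(cancel_pipe[1], F_SETFL, O_NONBLOCK) == -1)
        return -1;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = handle_sigint;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGINT, &sa, NULL);
}

/*
 * Hypothetical helper: block until either result_fd is readable or a cancel
 * was requested. Returns 1 for "cancel requested", 0 for "result data ready",
 * -1 on error. A pending cancel takes priority over result data.
 */
static int
wait_for_result_or_cancel(int result_fd)
{
    struct pollfd fds[2];

    fds[0].fd = result_fd;
    fds[0].events = POLLIN;
    fds[1].fd = cancel_pipe[0];
    fds[1].events = POLLIN;

    for (;;)
    {
        int rc = poll(fds, 2, -1);

        if (rc < 0)
        {
            if (errno == EINTR)
                continue;   /* interrupted by the signal itself; poll again */
            return -1;
        }
        if (fds[1].revents & POLLIN)
        {
            char buf[16];

            /* Drain the pipe so repeated Ctrl-C presses collapse into one. */
            while (read(cancel_pipe[0], buf, sizeof(buf)) > 0)
                ;
            return 1;
        }
        if (fds[0].revents & (POLLIN | POLLHUP | POLLERR))
            return 0;
    }
}
```

In a real frontend tool, a return value of 0 would presumably be followed by PQconsumeInput() and PQgetResult(), and a return value of 1 by sending the cancel request from the main thread. Every blocking PQexec call site would need a loop like this, which is where the size of the change comes from.
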
> One simple thing we could is to remember the "generation" in the signal
> handler, and store it in another global variable ("cancelledGeneration"
> or such). In the cancel thread, check that the generation matches;
> otherwise the thread is about to send a cancellation to a query that
> already finished, and should not send it.

I took a look at this, and attached a fixup patch that does this. It uses C11 atomics because locks cannot be taken in a signal handler, and the signal handler needs to read and write variables that are shared with two different threads. I'm not sure it pulls its own weight though. It seems a really unlikely scenario where the signal handler fires during one generation, but the cancel thread then wakes up during another. I'm not sure we should care about it. And I actually think it could make us miss a SIGINT cancel in non-interactive use cases of psql.

> I worry how this behaves if establishing the cancel connection gets
> stuck for a long time. Because of a network hiccup, for example. That's
> also not a new problem though; it's perhaps even worse today, if the
> signal handler gets stuck for a long time, trying to establish the
> connection. Still, would be good to do some testing with a bad network.

To make sure we're talking about the same situation, let me summarize it differently: establishing the connection for the cancel is slow, but the actual query connection is still fast. I think this is an interesting case to consider (much more interesting than the kind of race the additional generation check could protect against), first of all because I think it can definitely happen. Especially with SSL, the cancel needs several round trips, while a query on an already established connection only needs one-way latency.
I can definitely see how this could cause out-of-order arrival even on stable high-latency networks, especially if there are also some unfortunate packet drops.

And as you suggested the failure modes are different:
- With master, psql will become unresponsive until the client gets a response from the server (or TCP timeouts are hit).
- With this patchset, a later query will get cancelled.

I think for interactive psql usage both are annoying, but neither is the end of the world. I think I would personally prefer the current master behaviour. I'm not sure preserving it is worth all the additional code changes needed to make all the applications use non-blocking APIs, though. In any case, SSL for cancel keys is definitely worth the patchset behaviour to me (even though it sounds slightly worse).

For non-interactive use (i.e. running scripts in psql or other frontend tools like vacuumdb), I don't think this situation applies. You want whatever query is running to be cancelled.
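
For reference, the generation check discussed earlier in this mail can be sketched roughly as follows. This is not the attached fixup patch: apart from cancelledGeneration, which comes from the quoted suggestion, all names (currentGeneration, begin_query, record_cancel_request, cancel_still_relevant) are hypothetical, and the sketch assumes atomic_uint is lock-free on the platform so that it is safe to touch from a signal handler.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Bumped by the main thread every time a new query starts (hypothetical). */
static atomic_uint currentGeneration;

/* Generation that was current when the signal handler saw SIGINT. */
static atomic_uint cancelledGeneration;

/* Main thread: call before sending each query. */
static void
begin_query(void)
{
    atomic_fetch_add(&currentGeneration, 1);
}

/*
 * Signal handler side: remember which query the user wanted to cancel.
 * Only lock-free atomic operations are used, since taking a lock in a
 * signal handler is not allowed.
 */
static void
record_cancel_request(void)
{
    atomic_store(&cancelledGeneration, atomic_load(&currentGeneration));
}

/*
 * Cancel thread side: call right before sending the cancel request.
 * Returns false when the query that was current at SIGINT time has already
 * been replaced by a newer one, in which case the cancel should be skipped.
 */
static bool
cancel_still_relevant(void)
{
    return atomic_load(&cancelledGeneration) ==
           atomic_load(&currentGeneration);
}
```

Note that in this sketch a SIGINT recorded during one query is silently dropped once the next query has begun, which may be how a script run non-interactively ends up missing the cancel. It also does nothing for a cancel request that is already in flight on a slow connection.
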