Date: Wed, 22 Apr 2026 15:05:33 -0400
From: Andres Freund
To: Matthias van de Meent, Thomas Munro
Cc: PostgreSQL Hackers, Heikki Linnakangas, Masahiko Sawada
Subject: Re: Startup process deadlock: WaitForProcSignalBarriers vs aux process

Hi,

On 2026-04-22 13:21:02 +0200, Matthias van de Meent wrote:
> If the PSB is emitted (and signaled to checkpointer) before the
> checkpointer has registered its SIGUSR1 handler, then the checkpointer
> won't receive the notice to check its procsignal slots, it won't
> notice the updated procsignal flags, and it won't process the PSB; not
> until it receives a new SIGUSR1.
>
> Signals are sent to all processes that have their procsignal pss_pid
> set, which is true for every process which has called ProcSignalInit,
> which for the checkpointer (like other aux processes) happens in
> AuxiliaryProcessMainCommon. However, checkpointer (also like other aux
> processes) calls AuxiliaryProcessMainCommon before registering its
> signal handlers, creating a small window in time where signals are
> sent, but not handled.

Hm. Have we confirmed this happens?

CheckpointerMain() is called with all signals masked, so it should be ok
for the signal handler to only be set up after
AuxiliaryProcessMainCommon(), as long as it happens before

    /*
     * Unblock signals (they were blocked when the postmaster forked us)
     */
    sigprocmask(SIG_SETMASK, &UnBlockSig, NULL);

as the signal delivery should be held until after unblocking signals.

> # A solution?
>
> I don't have one right now.
> I was thinking in the direction of having a compile-time aux process
> signal handlers array per process type, which is read by
> AuxiliaryProcessMainCommon() to register the signal handlers ahead of
> ProcSignalInit(), but I've not yet looked at the exact implications,
> nor analyzed whether that's actually safe. It would move some
> duplicative code patterns into compile-time structs, but that's not
> necessarily a universal good.

We really should move setup of most signal handlers into
AuxiliaryProcessMainCommon(). While there are some special cases (like
checkpointer not wanting to handle SIGTERM), that can be configured
after AuxiliaryProcessMainCommon(), as signals will still be blocked.

Greetings,

Andres Freund