From: Ayush Tiwari
Date: Wed, 15 Apr 2026 23:31:10 +0530
Subject: Re: [PATCH] postmaster: fix stale PM_STARTUP comment
To: Heikki Linnakangas
Cc: pgsql-hackers@postgresql.org, "noah@leadboat.com"
In-Reply-To: <0f462532-9790-4334-b503-4ee522225820@iki.fi>

Hi

On Wed, 15 Apr 2026 at 22:21, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
> On 15/04/2026 16:57, Ayush Tiwari wrote:
> > Hi,
> >
> > The comment above the PM_STARTUP startup-process-failure case still says
> > that there are no other processes running yet, so the postmaster can just
> > exit.
> >
> > That no longer matches the current startup flow: PM_STARTUP may already
> > have auxiliary processes running by that point. The attached patch updates
> > that comment to describe the current behavior.
>
> Hmm, shouldn't the postmaster kill and wait for the auxiliary processes
> to exit first in that case? ISTM we need code changes here, not just
> comments.
>
> - Heikki

Yes, I agree, a code change is required here.

The proper thing is to route this through the existing crash-handling path
so the postmaster SIGQUITs the aux children and waits for them to exit
before terminating.

I think the minimal change is:
  1. Replace the ExitPostmaster(1) shortcut in the PM_STARTUP
     startup-failure case with HandleChildCrash(), which calls
     TerminateChildren(SIGQUIT) and transitions through the state
     machine.  Set StartupStatus = STARTUP_CRASHED so the state
     machine does not try to reinitialize.

  2. Let HandleFatalError() handle PM_STARTUP by transitioning to
     PM_WAIT_BACKENDS, instead of the current Assert(false).

The state machine already handles STARTUP_CRASHED at PM_NO_CHILDREN
("shutting down due to startup process failure"), so the exit path is
already correct once all children have drained.
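
To make the intended flow concrete, here is a toy model in C. It is only a
sketch and not postmaster.c code: the Toy* types, the toy_* functions, the
STARTUP_RUNNING value and the PM_EXITED state are invented for illustration,
and the real functions named above do considerably more. It only shows the
ordering: mark the startup process as crashed, signal the children, move to
PM_WAIT_BACKENDS, and exit only once no children remain.

```c
#include <assert.h>

/* Toy stand-ins; the names echo the mail but this is not PostgreSQL code. */
typedef enum { PM_STARTUP, PM_WAIT_BACKENDS, PM_NO_CHILDREN, PM_EXITED } ToyPMState;
typedef enum { STARTUP_RUNNING, STARTUP_CRASHED } ToyStartupStatus;

typedef struct
{
    ToyPMState       state;
    ToyStartupStatus startup_status;
    int              live_children;  /* aux processes still running */
    int              sigquits_sent;  /* children signalled so far */
    int              exit_code;      /* -1 while still running */
} ToyPostmaster;

/* Stand-in for signalling every live child with SIGQUIT. */
static void
toy_terminate_children(ToyPostmaster *pm)
{
    pm->sigquits_sent = pm->live_children;
}

/* Step 2 of the proposal: PM_STARTUP moves to PM_WAIT_BACKENDS. */
static void
toy_handle_fatal_error(ToyPostmaster *pm)
{
    if (pm->state == PM_STARTUP)
        pm->state = PM_WAIT_BACKENDS;
}

/* Advance the state machine; exit only when no children are left. */
static void
toy_advance(ToyPostmaster *pm)
{
    if (pm->state == PM_WAIT_BACKENDS && pm->live_children == 0)
        pm->state = PM_NO_CHILDREN;

    if (pm->state == PM_NO_CHILDREN && pm->startup_status == STARTUP_CRASHED)
    {
        /* "shutting down due to startup process failure" */
        pm->exit_code = 1;
        pm->state = PM_EXITED;
    }
}

/* Step 1 of the proposal: no immediate exit on startup-process failure. */
static void
toy_startup_failed(ToyPostmaster *pm)
{
    assert(pm->state == PM_STARTUP);
    pm->startup_status = STARTUP_CRASHED;   /* prevents reinitialization */
    toy_terminate_children(pm);
    toy_handle_fatal_error(pm);
    toy_advance(pm);
}

/* Called each time one child has been reaped. */
static void
toy_child_exited(ToyPostmaster *pm)
{
    assert(pm->live_children > 0);
    pm->live_children--;
    toy_advance(pm);
}
```

In this model, starting in PM_STARTUP with two aux children,
toy_startup_failed() leaves the state at PM_WAIT_BACKENDS with exit_code
still unset, and exit_code becomes 1 only after the second
toy_child_exited() call.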

Noah also discussed this issue in an older thread, so I am adding him in Cc.

I can send in a proper patch if you think this is the right way to go.

Regards,
Ayush