Date: Sat, 7 Sep 2024 11:11:43 -0700
From: Noah Misch
To: Alexander Lakhin
Cc: pgsql-hackers
Subject: Re: Yet another way for pg_ctl stop to fail on Windows
Message-ID: <20240907181143.11.nmisch@google.com>

On Sat, Sep 07, 2024 at 03:00:00PM +0300, Alexander Lakhin wrote:
> With extra logging added, I got:
> ### Stopping node "CIC_2PC_test" using mode fast
> # Running: pg_ctl -D C:\src\postgresql\build/testrun/amcheck_3/003_cic_2pc\data/t_003_cic_2pc_CIC_2PC_test_data/pgdata
> -m fast stop
> waiting for server to shut down......!!!pgkill| GetLastError(): 231
> postmaster (9596) died untimely? res: -1, errno: 22
>  failed
>
> Thus, CallNamedPipe() in pgkill() returned ERROR_PIPE_BUSY (All pipe
> instances are busy) and it was handled as an unexpected error.
> (The error code 231 returned 10 times out of 10 failures of this ilk for
> me.)

Thanks for discovering that.

> Noah, what do you think of handling this error in line with handling of
> ERROR_BROKEN_PIPE and ERROR_BAD_PIPE (which was done in 0ea1f2a3a)?
>
> I tried the following change:
>         switch (GetLastError())
>         {
>                 case ERROR_BROKEN_PIPE:
>                 case ERROR_BAD_PIPE:
> +               case ERROR_PIPE_BUSY:
> and saw no issues.

That would be a strict improvement over returning EINVAL like we do today.
We do use PIPE_UNLIMITED_INSTANCES, so I expect the causes of ERROR_PIPE_BUSY
are process exit and ENOMEM-like situations.  While that change is the best
thing if the process is exiting, it could silently drop the signal in
ENOMEM-like situations.

Consider the following alternative.  If sig==0, just return 0 like you
propose, because the process isn't completely gone.  Otherwise, sleep and
retry the signal, like pgwin32_open_handle() retries after certain errors.
What do you think of that?
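
To make the alternative concrete, here is a rough, portable sketch of the control flow only.  It is not a patch against pgkill().  Everything prefixed sketch_ or SKETCH_ is a hypothetical name; the pipe call and the sleep are passed in as function pointers standing in for CallNamedPipe() and a sleep call so the logic can be exercised off Windows.  The retry count, the 100 ms delay, and falling back to EINVAL once retries run out are assumptions of mine, not something settled above.

```c
#include <errno.h>

/* Win32 error codes, copied by value so the sketch compiles anywhere. */
enum
{
	SKETCH_ERROR_BROKEN_PIPE = 109,
	SKETCH_ERROR_BAD_PIPE = 230,
	SKETCH_ERROR_PIPE_BUSY = 231
};

#define SKETCH_MAX_RETRIES 10	/* hypothetical bound, not from the thread */
#define SKETCH_RETRY_MSEC 100	/* hypothetical delay, not from the thread */

/* Stand-in for CallNamedPipe(): returns 0 on success, else an error code. */
typedef int (*sketch_pipe_call) (void *state);
typedef void (*sketch_sleep) (int msec);

int
sketch_kill(int sig, sketch_pipe_call call, void *state, sketch_sleep nap)
{
	int			attempt;

	for (attempt = 0;; attempt++)
	{
		switch (call(state))
		{
			case 0:
				return 0;
			case SKETCH_ERROR_BROKEN_PIPE:
			case SKETCH_ERROR_BAD_PIPE:
				/* process is exiting; report success */
				return 0;
			case SKETCH_ERROR_PIPE_BUSY:
				if (sig == 0)
					return 0;	/* process isn't completely gone */
				if (attempt < SKETCH_MAX_RETRIES)
				{
					nap(SKETCH_RETRY_MSEC);
					continue;	/* retry rather than drop the signal */
				}
				errno = EINVAL; /* assumption: give up as we do today */
				return -1;
			default:
				errno = EINVAL; /* unexpected */
				return -1;
		}
	}
}

/* Fake pipe for trying the logic out: busy N times, then a final result. */
struct sketch_fake
{
	int			busy_left;
	int			final_err;
	int			calls;
};

int			sketch_naps_taken = 0;

int
sketch_fake_call(void *state)
{
	struct sketch_fake *f = state;

	f->calls++;
	if (f->busy_left > 0)
	{
		f->busy_left--;
		return SKETCH_ERROR_PIPE_BUSY;
	}
	return f->final_err;
}

void
sketch_count_nap(int msec)
{
	(void) msec;				/* a real caller would sleep here */
	sketch_naps_taken++;
}
```

With the fake above, a real signal that sees a few busy results is delivered on a later attempt, while a sig==0 probe returns 0 on the first busy result without sleeping.  Bounding the retries keeps pg_ctl from hanging if the busy condition never clears; whether a bound is wanted at all is an open question.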