public inbox for [email protected]  
help / color / mirror / Atom feed
From: Alexander Lakhin <[email protected]>
To: Noah Misch <[email protected]>
Cc: pgsql-hackers <[email protected]>
Subject: Re: Yet another way for pg_ctl stop to fail on Windows
Date: Sun, 8 Sep 2024 18:00:00 +0300
Message-ID: <[email protected]> (raw)
In-Reply-To: <[email protected]>
References: <[email protected]>
	<[email protected]>

07.09.2024 21:11, Noah Misch wrote:

>
>> Noah, what do you think of handling this error in line with handling of
>> ERROR_BROKEN_PIPE and ERROR_BAD_PIPE (which was done in 0ea1f2a3a)?
>>
>> I tried the following change:
>>          switch (GetLastError())
>>          {
>>                  case ERROR_BROKEN_PIPE:
>>                  case ERROR_BAD_PIPE:
>> +               case ERROR_PIPE_BUSY:
>> and saw no issues.
> That would be a strict improvement over returning EINVAL like we do today.  We
> do use PIPE_UNLIMITED_INSTANCES, so I expect the causes of ERROR_PIPE_BUSY are
> process exit and ENOMEM-like situations.  While that change is the best thing
> if the process is exiting, it could silently drop the signal in ENOMEM-like
> situations.  Consider the following alternative.  If sig==0, just return 0
> like you propose, because the process isn't completely gone.  Otherwise, sleep
> and retry the signal, like pgwin32_open_handle() retries after certain errors.
> What do you think of that?

Thank you for your attention to the issue!

I agree with your approach. It looks like Microsoft recommends to loop on
ERROR_PIPE_BUSY: [1] (they say "Calling CallNamedPipe is equivalent to
calling the CreateFile ..." at [2]).

So if we aim to not only fix "pg_ctl stop", but to make pgkill() robust,
it's the way to go, IMHO. I'm not sure about an infinite loop they show,
I'd vote for a loop with the same characteristics as in
pgwin32_open_handle().

I've managed to trigger ERROR_PIPE_BUSY with "pg_ctl reload", when running
20 TAP tests (see attached) in parallel (with 20 threads executing
$node->reload() in each), and with the kill() call inside do_reload()
repeated x100 (see the modification patch attached too):
...
# Running: pg_ctl -D .../099_test_pgkill\data/t_099_test_pgkill_node_data/pgdata reload
### Reloading node "node"
# Running: pg_ctl -D .../099_test_pgkill\data/t_099_test_pgkill_node_data/pgdata reload
[13:41:46.850](2.400s) # 18
server signaled
server signaled
server signaled
server signaled
server signaled
server signaled
server signaled
server signaled
!!!pgkill| GetLastError(): 231
pg_ctl: could not send reload signal (PID: 3912), iteration: 81, res: -1, errno: 22: Invalid argument
server signaled
[13:41:52.594](5.744s) # 19
...

[1] https://learn.microsoft.com/en-us/windows/win32/ipc/named-pipe-client
[2] https://learn.microsoft.com/en-us/windows/win32/api/winbase/nf-winbase-callnamedpipea

Best regards,
Alexander

Attachments:

  [text/x-patch] pgkill-debugging.patch (1.1K, 2-pgkill-debugging.patch)
  download | inline diff:
diff --git a/src/bin/pg_ctl/pg_ctl.c b/src/bin/pg_ctl/pg_ctl.c
index e7e878c22f..cab6eb9472 100644
--- a/src/bin/pg_ctl/pg_ctl.c
+++ b/src/bin/pg_ctl/pg_ctl.c
@@ -1167,12 +1174,17 @@ do_reload(void)
 		exit(1);
 	}
 
-	if (kill(pid, sig) != 0)
+for (int i = 1; i <= 100; i++)
+{
+int res, en;
+	if ((res = kill(pid, sig)) != 0)
 	{
-		write_stderr(_("%s: could not send reload signal (PID: %d): %m\n"),
-					 progname, (int) pid);
+en = errno;
+		write_stderr(_("%s: could not send reload signal (PID: %d), iteration: %d, res: %d, errno: %d: %m\n"),
+					 progname, (int) pid, i, res, en);
 		exit(1);
 	}
+}
 
 	print_msg(_("server signaled\n"));
 }
diff --git a/src/port/kill.c b/src/port/kill.c
index 412c2f19c1..b3d2c00796 100644
--- a/src/port/kill.c
+++ b/src/port/kill.c
@@ -70,6 +70,7 @@ pgkill(int pid, int sig)
 		return 0;
 	}
 
+int le = GetLastError();
 	switch (GetLastError())
 	{
 		case ERROR_BROKEN_PIPE:
@@ -89,6 +90,7 @@ pgkill(int pid, int sig)
 			errno = EPERM;
 			return -1;
 		default:
+fprintf(stderr, "!!!pgkill| GetLastError(): %d\n", le);
 			errno = EINVAL;		/* unexpected */
 			return -1;
 	}


  [application/x-perl] 099_test_pgkill.pl (1.5K, 3-099_test_pgkill.pl)
  download

view thread (3+ messages)  latest in thread

reply

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Reply to all the recipients using the --to and --cc options:
  reply via email

  To: [email protected]
  Cc: [email protected], [email protected]
  Subject: Re: Yet another way for pg_ctl stop to fail on Windows
  In-Reply-To: <[email protected]>

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox