Problem with pcp process

public inbox for [email protected]  
help / color / mirror / Atom feed

Problem with pcp process
2+ messages / 1 participants
[nested] [flat]

* Problem with pcp process
@ 2026-04-25 13:30  Tatsuo Ishii <[email protected]>
  0 siblings, 1 reply; 2+ messages in thread

From: Tatsuo Ishii @ 2026-04-25 13:30 UTC (permalink / raw)
  To: [email protected]

Koshino told me off-list that following script does not work:
-------------------------------------
pgpool_setup -n 3 --no-stop
pg_ctl -D data2 stop
while true
do
    psql -p 11000 -c "show pool_nodes" test
    if [ $? = 0 ];then
	break;
    fi
    sleep 1
done
psql -p 11000 -c "show pool_nodes" test
pcp_recovery_node -p 11001 -n 2;pcp_promote_node -p 11001 -n 2 -s -g
-------------------------------------

pcp_recovery_node reports success but pcp_promote_node just hangs. I
found pcp worker process loops infinitely around line 584 in
pool_detach_node (pcp_worker.c):

	while (!pcp_worker_wakeup_request)
	{
		struct timeval t = {1, 0};

		select(0, NULL, NULL, NULL, &t);
	}

pcp_worker_wakeup_request is a variable supposed to be set to 1 by
SIGUSR2 signal handler. When pgpool main finishes failover requests
from pcp, it sends SIGUSR2 to pcp main process, then it forwards to
pcp worker process, and its signal handler sets the variable to 1. To
find the process id to forward the signal, pcp main process keeps a
list of pids of forked children (pcp worker process) in its local
memory.

Upon failover, pgpool main sends a signal to pcp main process to
request restarting, and pgpool main restarts. Problem is, when pcp
main restarts, it forgets the list of pids. As a result, when pgpool
main sends SIGUSR2 to pcp main, it cannot find the pid to send the
signal to, which causes the infinite loop in pcp worker process.

To fix the problem, we could delay the restarting of pcp main until it
delivers the signal. Unfortunately this does not work, since pgpool
main waits for pcp main process to exit. Thus processing failover does
not proceed in pgpool main.

So I decided to add a new shared memory area to hold the pcp workers
pids as an array. Upon restarting of pcp main process, it reads the
pids from the shared memory into its local memory. When child process
is forked, its pid is added to the shared memory array. When child
process exits, its pid in the array is cleared to 0, representing an
empty slot.

Attached is a patch to implement it.

I also find similar issue with pgpool_setup. For example, pgpool_setup
-n 3 creates 3 PostgreSQL nodes. To create the standbys, pgpool_setup
uses pcp_recovery_node command. The first node creation is fine. But
in the second creation, pcp_recovery_node actually is timed out (5
seconds). pcp_recovery_node also has a similar loop above. However the
loop is timed out, instead of infinite looping. As a result, the
second pcp_recovery_node looks as if suceeded, just takes longer time
(5 seconds). The patch also fixed the case: now the second
pcp_recovery_node finishes quickly.

Regards,
--
Tatsuo Ishii
SRA OSS K.K.
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp


Attachments:

  [text/x-patch] pcp_child_sigusr2_fix.patch (3.6K, 2-pcp_child_sigusr2_fix.patch)
  download | inline diff:
diff --git a/src/include/pool.h b/src/include/pool.h
index 65907dcf1..fea5744f3 100644
--- a/src/include/pool.h
+++ b/src/include/pool.h
@@ -487,6 +487,13 @@ typedef struct
 	int			count;			/* request node ids count */
 } POOL_REQUEST_NODE;
 
+/*
+ * Maximum number of pcp worker child process.  * Since pcp worker process is
+ * forked whenever failover/failback request is made, it should be equal to
+ * MAX_REQUEST_QUEUE_SIZE + some room. 10 is an arbitrary number.
+*/
+#define	MAX_PCP_WORKER_PIDS		MAX_REQUEST_QUEUE_SIZE + 10
+
 typedef struct
 {
 	POOL_REQUEST_NODE request[MAX_REQUEST_QUEUE_SIZE];
@@ -524,6 +531,12 @@ typedef struct
 	bool		query_cache_invalidate_request; /* true if
 												 * pcp_invalidate_query_cache
 												 * requested */
+
+	/*
+	 * pcp worker child pids. This is inherited to new pcp main process to
+	 * track pcp worker child when new pcp worker child starts.
+	 */
+	pid_t		pcp_worker_pids[MAX_PCP_WORKER_PIDS];
 } POOL_REQUEST_INFO;
 
 /* description of row. corresponding to RowDescription message */
diff --git a/src/main/pgpool_main.c b/src/main/pgpool_main.c
index 32bcb0a1f..4112074e2 100644
--- a/src/main/pgpool_main.c
+++ b/src/main/pgpool_main.c
@@ -3200,6 +3200,8 @@ initialize_shared_mem_objects(bool clear_memcache_oidmaps)
 		wd_ipc_initialize_data();
 	}
 
+	/* initialize pcp worker child pids */
+	memset(Req_info->pcp_worker_pids, 0, sizeof(Req_info->pcp_worker_pids));
 }
 
 /*
diff --git a/src/pcp_con/pcp_child.c b/src/pcp_con/pcp_child.c
index e07c8897e..fc1cda311 100644
--- a/src/pcp_con/pcp_child.c
+++ b/src/pcp_con/pcp_child.c
@@ -154,6 +154,16 @@ pcp_main(int *fds)
 	/* We can now handle ereport(ERROR) */
 	PG_exception_stack = &local_sigjmp_buf;
 
+	/*
+	 * Restore pcp woker child pids from shmem
+	 */
+	for (int i = 0; i < MAX_PCP_WORKER_PIDS; i++)
+	{
+		pid_t	pid = Req_info->pcp_worker_pids[i];
+		if (pid != 0)
+			pcp_worker_children = lappend_int(pcp_worker_children, (int) pid);
+	}
+
 	/*
 	 * Unblock signals
 	 */
@@ -326,6 +336,7 @@ start_pcp_command_processor_process(int port, int *fds)
 	}
 	else						/* parent */
 	{
+		int		i;
 		if (pool_config->log_pcp_processes)
 			ereport(LOG,
 					(errmsg("forked new pcp worker, pid=%d socket=%d",
@@ -334,6 +345,18 @@ start_pcp_command_processor_process(int port, int *fds)
 		close(port);
 		/* Add it to the list */
 		pcp_worker_children = lappend_int(pcp_worker_children, (int) pid);
+		/* save it to shmem */
+		for (i = 0; i < MAX_PCP_WORKER_PIDS; i++)
+		{
+			if (Req_info->pcp_worker_pids[i] == 0)
+			{
+				Req_info->pcp_worker_pids[i] = pid;
+				break;
+			}
+		}
+		if (i == MAX_PCP_WORKER_PIDS)
+			ereport(WARNING,
+					(errmsg("no empty slot in pcp worker table")));
 	}
 }
 
@@ -378,6 +401,16 @@ reaper(void)
 
 	while ((pid = pool_waitpid(&status)) > 0)
 	{
+		/* remove the pid from shmem */
+		for (int i = 0; i < MAX_PCP_WORKER_PIDS; i++)
+		{
+			if (Req_info->pcp_worker_pids[i] == pid)
+			{
+				Req_info->pcp_worker_pids[i] = 0;
+				break;
+			}
+		}
+
 		if (WIFEXITED(status))
 		{
 			if (WEXITSTATUS(status) == POOL_EXIT_FATAL)
@@ -405,6 +438,17 @@ reaper(void)
 				(errmsg("going to remove pid: %d from pid list having %d elements", pid, list_length(pcp_worker_children))));
 		/* remove the pid of process from the list */
 		pcp_worker_children = list_delete_int(pcp_worker_children, pid);
+
+		/* remove the pid from shmem */
+		for (int i = 0; i < MAX_PCP_WORKER_PIDS; i++)
+		{
+			if (Req_info->pcp_worker_pids[i] == pid)
+			{
+				Req_info->pcp_worker_pids[i] = 0;
+				break;
+			}
+		}
+
 		ereport(DEBUG2,
 				(errmsg("new list have %d elements", list_length(pcp_worker_children))));
 	}


^ permalink  raw  reply  [nested|flat] 2+ messages in thread

* Re: Problem with pcp process
@ 2026-04-26 05:56  Tatsuo Ishii <[email protected]>
  parent: Tatsuo Ishii <[email protected]>
  0 siblings, 0 replies; 2+ messages in thread

From: Tatsuo Ishii @ 2026-04-26 05:56 UTC (permalink / raw)
  To: [email protected]

> Koshino told me off-list that following script does not work:
> -------------------------------------
> pgpool_setup -n 3 --no-stop
> pg_ctl -D data2 stop
> while true
> do
>     psql -p 11000 -c "show pool_nodes" test
>     if [ $? = 0 ];then
> 	break;
>     fi
>     sleep 1
> done
> psql -p 11000 -c "show pool_nodes" test
> pcp_recovery_node -p 11001 -n 2;pcp_promote_node -p 11001 -n 2 -s -g
> -------------------------------------
> 
> pcp_recovery_node reports success but pcp_promote_node just hangs. I
> found pcp worker process loops infinitely around line 584 in
> pool_detach_node (pcp_worker.c):
> 
> 	while (!pcp_worker_wakeup_request)
> 	{
> 		struct timeval t = {1, 0};
> 
> 		select(0, NULL, NULL, NULL, &t);
> 	}
> 
> pcp_worker_wakeup_request is a variable supposed to be set to 1 by
> SIGUSR2 signal handler. When pgpool main finishes failover requests
> from pcp, it sends SIGUSR2 to pcp main process, then it forwards to
> pcp worker process, and its signal handler sets the variable to 1. To
> find the process id to forward the signal, pcp main process keeps a
> list of pids of forked children (pcp worker process) in its local
> memory.
> 
> Upon failover, pgpool main sends a signal to pcp main process to
> request restarting, and pgpool main restarts. Problem is, when pcp
> main restarts, it forgets the list of pids. As a result, when pgpool
> main sends SIGUSR2 to pcp main, it cannot find the pid to send the
> signal to, which causes the infinite loop in pcp worker process.
> 
> To fix the problem, we could delay the restarting of pcp main until it
> delivers the signal. Unfortunately this does not work, since pgpool
> main waits for pcp main process to exit. Thus processing failover does
> not proceed in pgpool main.
> 
> So I decided to add a new shared memory area to hold the pcp workers
> pids as an array. Upon restarting of pcp main process, it reads the
> pids from the shared memory into its local memory. When child process
> is forked, its pid is added to the shared memory array. When child
> process exits, its pid in the array is cleared to 0, representing an
> empty slot.
> 
> Attached is a patch to implement it.

In the patch there were duplicate for loops in
pcp_child.c:reaper(). Attached v2 patch removes it. Also update
copright year of pcp_child.c.

Regards,
--
Tatsuo Ishii
SRA OSS K.K.
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp


Attachments:

  [application/octet-stream] v2-0001-Fix-pcp-main-process-to-remember-child-pids-upon-.patch (5.7K, 2-v2-0001-Fix-pcp-main-process-to-remember-child-pids-upon-.patch)
  download | inline diff:
From 202490e15daca3b9ed67a3672ced09efce4f5cfa Mon Sep 17 00:00:00 2001
From: Tatsuo Ishii <[email protected]>
Date: Sun, 26 Apr 2026 14:45:11 +0900
Subject: [PATCH v2] Fix pcp main process to remember child pids upon
 restarting.

Upon failove/failback, pgpool main process sends a signal to pcp main
process to request restarting, and pcp main restarts. Problem is, when
pcp main restarts, it forgets the list of pids of its child pcp worker
process. As a result, when pgpool main sends SIGUSR2 to pcp main to
wake up pcp worker process, it cannot find the pid to send the signal
to, which causes the infinite loop in pcp worker process.

Below is a reproducer script. pcp_promote_node hangs.
------------------------------------------------
pgpool_setup -n 3 --no-stop
pg_ctl -D data2 stop
while true
do
    psql -p 11000 -c "show pool_nodes" test
    if [ $? = 0 ];then
	break;
    fi
    sleep 1
done
psql -p 11000 -c "show pool_nodes" test
pcp_recovery_node -p 11001 -n 2;pcp_promote_node -p 11001 -n 2 -s -g
------------------------------------------------

To fix the problem, we could delay the restarting of pcp main until it
delivers the signal. Unfortunately this does not work, since pgpool
main waits for pcp main process to exit. Thus processing failover does
not proceed in pgpool main.

So I decided to add a new shared memory area
(Req_info->pcp_worker_pids) to remember the pcp workers pids across
pcp main restarting. The array size is MAX_REQUEST_QUEUE_SIZE + some
room (currently 10). When child process is forked from pcp main, its
pid is added to the shared memory array. When child process exits, its
pid in the array is cleared to 0, representing an empty slot.

Author: Tatsuo Ishii <[email protected]>
Reported-by: Taiki Koshino <[email protected]>
Discussion: https://www.postgresql.org/message-id/20260425.223051.1207744844622514060.ishii%40postgresql.org
---
 src/include/pool.h      | 13 +++++++++++++
 src/main/pgpool_main.c  |  2 ++
 src/pcp_con/pcp_child.c | 38 +++++++++++++++++++++++++++++++++++++-
 3 files changed, 52 insertions(+), 1 deletion(-)

diff --git a/src/include/pool.h b/src/include/pool.h
index 65907dcf1..fea5744f3 100644
--- a/src/include/pool.h
+++ b/src/include/pool.h
@@ -487,6 +487,13 @@ typedef struct
 	int			count;			/* request node ids count */
 } POOL_REQUEST_NODE;
 
+/*
+ * Maximum number of pcp worker child process.  * Since pcp worker process is
+ * forked whenever failover/failback request is made, it should be equal to
+ * MAX_REQUEST_QUEUE_SIZE + some room. 10 is an arbitrary number.
+*/
+#define	MAX_PCP_WORKER_PIDS		MAX_REQUEST_QUEUE_SIZE + 10
+
 typedef struct
 {
 	POOL_REQUEST_NODE request[MAX_REQUEST_QUEUE_SIZE];
@@ -524,6 +531,12 @@ typedef struct
 	bool		query_cache_invalidate_request; /* true if
 												 * pcp_invalidate_query_cache
 												 * requested */
+
+	/*
+	 * pcp worker child pids. This is inherited to new pcp main process to
+	 * track pcp worker child when new pcp worker child starts.
+	 */
+	pid_t		pcp_worker_pids[MAX_PCP_WORKER_PIDS];
 } POOL_REQUEST_INFO;
 
 /* description of row. corresponding to RowDescription message */
diff --git a/src/main/pgpool_main.c b/src/main/pgpool_main.c
index 32bcb0a1f..4112074e2 100644
--- a/src/main/pgpool_main.c
+++ b/src/main/pgpool_main.c
@@ -3200,6 +3200,8 @@ initialize_shared_mem_objects(bool clear_memcache_oidmaps)
 		wd_ipc_initialize_data();
 	}
 
+	/* initialize pcp worker child pids */
+	memset(Req_info->pcp_worker_pids, 0, sizeof(Req_info->pcp_worker_pids));
 }
 
 /*
diff --git a/src/pcp_con/pcp_child.c b/src/pcp_con/pcp_child.c
index e07c8897e..23d233146 100644
--- a/src/pcp_con/pcp_child.c
+++ b/src/pcp_con/pcp_child.c
@@ -5,7 +5,7 @@
  * pgpool: a language independent connection pool server for PostgreSQL
  * written by Tatsuo Ishii
  *
- * Copyright (c) 2003-2023	PgPool Global Development Group
+ * Copyright (c) 2003-2026	PgPool Global Development Group
  *
  * Permission to use, copy, modify, and distribute this software and
  * its documentation for any purpose and without fee is hereby
@@ -154,6 +154,17 @@ pcp_main(int *fds)
 	/* We can now handle ereport(ERROR) */
 	PG_exception_stack = &local_sigjmp_buf;
 
+	/*
+	 * Restore pcp woker child pids from shmem
+	 */
+	for (int i = 0; i < MAX_PCP_WORKER_PIDS; i++)
+	{
+		pid_t		pid = Req_info->pcp_worker_pids[i];
+
+		if (pid != 0)
+			pcp_worker_children = lappend_int(pcp_worker_children, (int) pid);
+	}
+
 	/*
 	 * Unblock signals
 	 */
@@ -326,6 +337,8 @@ start_pcp_command_processor_process(int port, int *fds)
 	}
 	else						/* parent */
 	{
+		int			i;
+
 		if (pool_config->log_pcp_processes)
 			ereport(LOG,
 					(errmsg("forked new pcp worker, pid=%d socket=%d",
@@ -334,6 +347,18 @@ start_pcp_command_processor_process(int port, int *fds)
 		close(port);
 		/* Add it to the list */
 		pcp_worker_children = lappend_int(pcp_worker_children, (int) pid);
+		/* save it to shmem */
+		for (i = 0; i < MAX_PCP_WORKER_PIDS; i++)
+		{
+			if (Req_info->pcp_worker_pids[i] == 0)
+			{
+				Req_info->pcp_worker_pids[i] = pid;
+				break;
+			}
+		}
+		if (i == MAX_PCP_WORKER_PIDS)
+			ereport(WARNING,
+					(errmsg("no empty slot in pcp worker table")));
 	}
 }
 
@@ -405,6 +430,17 @@ reaper(void)
 				(errmsg("going to remove pid: %d from pid list having %d elements", pid, list_length(pcp_worker_children))));
 		/* remove the pid of process from the list */
 		pcp_worker_children = list_delete_int(pcp_worker_children, pid);
+
+		/* remove the pid from shmem */
+		for (int i = 0; i < MAX_PCP_WORKER_PIDS; i++)
+		{
+			if (Req_info->pcp_worker_pids[i] == pid)
+			{
+				Req_info->pcp_worker_pids[i] = 0;
+				break;
+			}
+		}
+
 		ereport(DEBUG2,
 				(errmsg("new list have %d elements", list_length(pcp_worker_children))));
 	}
-- 
2.43.0



^ permalink  raw  reply  [nested|flat] 2+ messages in thread

end of thread, other threads:[~2026-04-26 05:56 UTC | newest]

Thread overview: 2+ messages (download: mbox.gz follow: Atom feed)
-- links below jump to the message on this page --
2026-04-25 13:30 Problem with pcp process Tatsuo Ishii <[email protected]>
2026-04-26 05:56 ` Tatsuo Ishii <[email protected]>

This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox