rebooting a standby causes it go down on pgpool side

public inbox for [email protected]  
3+ messages / 2 participants

* rebooting a standby causes it go down on pgpool side
@ 2025-10-30 10:33  Luca Ferrari <[email protected]>
  0 siblings, 1 reply; 3+ messages in thread

From: Luca Ferrari @ 2025-10-30 10:33 UTC (permalink / raw)
  To: [email protected]

Hi all,
I have three identical machines running pgpool 4.6.2 and PostgreSQL 17:
one primary (pg1) and two standbys (pg2 and pg3).

The initial situation is:

% ssh pg1 'sudo -u postgres pcp_node_info -U pgpool'
pg1 5432 1 0.166667 waiting up primary primary 0 none none 2025-10-30 11:01:05
pg2 5432 1 0.333333 waiting up standby standby 0 streaming async 2025-10-30 11:01:05
pg3 5432 1 0.500000 waiting up standby standby 0 streaming async 2025-10-30 11:01:05

% ssh pg2 'sudo -u postgres pcp_node_info -U pgpool'
pg1 5432 1 0.166667 waiting up primary primary 0 none none 2025-10-30 11:01:10
pg2 5432 1 0.333333 waiting up standby standby 0 streaming async 2025-10-30 11:01:10
pg3 5432 1 0.500000 waiting up standby standby 0 streaming async 2025-10-30 11:01:10

% ssh pg3 'sudo -u postgres pcp_node_info -U pgpool'
pg1 5432 1 0.166667 waiting up primary primary 0 none none 2025-10-30 11:01:10
pg2 5432 1 0.333333 waiting up standby standby 0 streaming async 2025-10-30 11:01:10
pg3 5432 1 0.500000 waiting up standby standby 0 streaming async 2025-10-30 11:01:10


I don't know why the nodes are reported as "waiting up" even though
everything is working fine; I've waited several minutes without
seeing any change in the status.

Now, if I reboot a standby, let's say pg3, it comes back with a "down
up" status, and the only way I have to make pgpool see the node as
healthy again is to run pcp_attach_node.


% ssh pg3 'sudo reboot'

% ssh pg1 'sudo -u postgres pcp_node_info -U pgpool'
pg1 5432 1 0.166667 waiting up primary primary 0 none none 2025-10-30 11:01:05
pg2 5432 1 0.333333 waiting up standby standby 0 streaming async 2025-10-30 11:01:05
pg3 5432 3 0.500000 down up standby standby 0 streaming async 2025-10-30 11:08:21

% ssh pg2 'sudo -u postgres pcp_node_info -U pgpool'
pg1 5432 1 0.166667 waiting up primary primary 0 none none 2025-10-30 11:01:10
pg2 5432 1 0.333333 waiting up standby standby 0 streaming async 2025-10-30 11:01:10
pg3 5432 3 0.500000 down up standby standby 0 streaming async 2025-10-30 11:08:21

% ssh pg3 'sudo -u postgres pcp_node_info -U pgpool'
pg1 5432 1 0.166667 waiting up primary primary 0 none none 2025-10-30 11:09:08
pg2 5432 1 0.333333 waiting up standby standby 0 streaming async 2025-10-30 11:09:08
pg3 5432 3 0.500000 down up standby standby 0 streaming async 2025-10-30 11:09:08
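
In each pcp_node_info line, the fifth field is the status pgpool has
recorded for the node and the sixth is the actual backend status (as
far as I understand the pcp_node_info documentation), so "down up"
means pgpool has marked pg3 as down while PostgreSQL itself answers.
A minimal sketch for spotting that mismatch, assuming one node per
line in the column order shown here; the function name
detached_but_alive is made up for illustration:

```shell
# Print the hostname of every node that pgpool has marked "down"
# (field 5) while the backend itself is reported "up" (field 6).
# Reads pcp_node_info output on stdin, one node per line.
detached_but_alive() {
    awk '$5 == "down" && $6 == "up" { print $1 }'
}

# Usage, with the same command as in the transcript:
#   ssh pg1 'sudo -u postgres pcp_node_info -U pgpool' | detached_but_alive
```

Any host it prints is a candidate for pcp_attach_node.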

Is this normal? The node is streaming regularly, so it is fine on the
postgres side, and I would expect it to regain its previous status at
boot (i.e., at least waiting).

Anything I should dig for?

Thanks,
Luca






* Re: rebooting a standby causes it go down on pgpool side
@ 2025-11-04 02:32  Tatsuo Ishii <[email protected]>
  parent: Luca Ferrari <[email protected]>
  0 siblings, 1 reply; 3+ messages in thread

From: Tatsuo Ishii @ 2025-11-04 02:32 UTC (permalink / raw)
  To: [email protected]; +Cc: [email protected]

> Hi all,
> I have three identical machines running pgpool 4.6.2 and PostgreSQL 17:
> one primary (pg1) and two standbys (pg2 and pg3).
> 
> The initial situation is:
> 
> % ssh pg1 'sudo -u postgres pcp_node_info -U pgpool'
> pg1 5432 1 0.166667 waiting up primary primary 0 none none 2025-10-30 11:01:05
> pg2 5432 1 0.333333 waiting up standby standby 0 streaming async 2025-10-30 11:01:05
> pg3 5432 1 0.500000 waiting up standby standby 0 streaming async 2025-10-30 11:01:05
>
> % ssh pg2 'sudo -u postgres pcp_node_info -U pgpool'
> pg1 5432 1 0.166667 waiting up primary primary 0 none none 2025-10-30 11:01:10
> pg2 5432 1 0.333333 waiting up standby standby 0 streaming async 2025-10-30 11:01:10
> pg3 5432 1 0.500000 waiting up standby standby 0 streaming async 2025-10-30 11:01:10
>
> % ssh pg3 'sudo -u postgres pcp_node_info -U pgpool'
> pg1 5432 1 0.166667 waiting up primary primary 0 none none 2025-10-30 11:01:10
> pg2 5432 1 0.333333 waiting up standby standby 0 streaming async 2025-10-30 11:01:10
> pg3 5432 1 0.500000 waiting up standby standby 0 streaming async 2025-10-30 11:01:10
> 
> 
> I don't know why the nodes are reported as "waiting up" even though
> everything is working fine; I've waited several minutes without
> seeing any change in the status.




* Re: rebooting a standby causes it go down on pgpool side
@ 2025-11-06 08:06  Luca Ferrari <[email protected]>
  parent: Tatsuo Ishii <[email protected]>
  0 siblings, 0 replies; 3+ messages in thread

From: Luca Ferrari @ 2025-11-06 08:06 UTC (permalink / raw)
  To: Tatsuo Ishii <[email protected]>; +Cc: [email protected]

On Tue, Nov 4, 2025 at 3:32 AM Tatsuo Ishii <[email protected]> wrote:
>
> From the official document:
> "2 - Node is up. Connections are pooled."
> https://www.pgpool.net/docs/latest/en/html/pcp-node-info.html
>
> So you can safely assume that "waiting" also means the node is up (for
> pgpool).  To change the status from "waiting" to "up", you can issue
> any SQL command (for example "show pool_nodes") to pgpool.
>

Thanks, it is clear now!

> I think what happens there is that pgpool's health check detects that
> PostgreSQL on pg3 has gone down and triggers a failover for pg3.  As
> stated somewhere in the official docs, pgpool automatically sets a
> PostgreSQL node to down status, but does not automatically change the
> status back to up (or waiting). This is not a bug. It's by design.  If
> you want the status to return to up automatically, please consider
> using auto_failback.

Thanks, I was not assuming it was a bug, only asking for confirmation
about this behavior.
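
For reference, a minimal pgpool.conf sketch of the auto_failback
suggestion. The values are illustrative assumptions, not taken from my
setup, and the sr_check_user name is hypothetical. As far as I can tell
from the docs, auto_failback depends on the streaming replication
check, so sr_check_period has to be non-zero.

```
# pgpool.conf fragment (illustrative values, not from this cluster)
auto_failback = on             # re-attach a "down" standby that is streaming normally
auto_failback_interval = 60    # minimum seconds between automatic failbacks (assumed value)
sr_check_period = 10           # streaming replication check must be enabled (> 0)
sr_check_user = 'pgpool'       # hypothetical user allowed to run the replication check
```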

>
> If you want to avoid the failover, you can tweak the health check
> parameters so that the health check keeps retrying while PostgreSQL
> is rebooting.  For example, increase health_check_max_retries.
> Note that the health_check parameters can be changed by reloading
> pgpool.conf, so you can change the parameter and later restore the
> previous value without restarting pgpool.

Definitely an option, but on the other hand increasing the number of
retries will make pgpool unable to "quickly" catch a real failure, so
I guess it is better, in my case, to deal with the "down" status after
a reboot.
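
To put a rough number on that trade-off, here is a back-of-the-envelope
sketch. It assumes each health check attempt can take up to
health_check_timeout seconds and that pgpool waits
health_check_retry_delay seconds between retries. This is a simplified
model, not pgpool's exact timing, and the function name is invented
for illustration.

```shell
# Rough upper bound, in seconds, on how long a failed node can go
# undetected once health checks start failing:
#   (retries + 1) attempts * timeout + retries * retry_delay
# Arguments: health_check_timeout health_check_max_retries health_check_retry_delay
worst_case_detection_delay() {
    timeout=$1 retries=$2 delay=$3
    echo $(( (retries + 1) * timeout + retries * delay ))
}

worst_case_detection_delay 20 0 1    # no retries: prints 20
worst_case_detection_delay 20 5 10   # five retries, 10 s apart: prints 170
```

Under this model, riding out a reboot of a minute or two costs roughly
the same delay when detecting a genuine failure.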

Thanks,
Luca






