From: Fernando Hevia <[email protected]>
To: Tayyab Fayyaz <[email protected]>
Cc: Chris Lee <[email protected]>
Cc: Imran Khan <[email protected]>
Cc: pgsql-admin <[email protected]>
Subject: Re: repmgr cannot bring up the standby database after switchover manaully
Date: Wed, 1 Oct 2025 17:03:16 -0300
Message-ID: <CAGYT1XTZOMQ88zfN7T76RJTam47XyX-U1V_CqTTVhk2f5LVObA@mail.gmail.com>
In-Reply-To: <CAFVRaQ0zz-81jReYtOEo2pHXD0GzZ5sr6=mL7nOQ_EBZkRh8MA@mail.gmail.com>
References: <CAP0GqH7hYsFGmLg3Ge-_Lf_VEeES8yeHpOhSvn071=XRnmVxTw@mail.gmail.com>
<CAFVRaQ3No2rEYjkuKsxNNMCP16_J_kjDm_pv0+CYwTLXsjE8OQ@mail.gmail.com>
<CAC4eXDgNZM12t8OgKYhjbu_7NFcsu0qG5aTpdXmbN=fmmp-5Tw@mail.gmail.com>
<CAP0GqH7oRXztf1Ti13usH9CEgBmMJ4mPMsOn+qacD8Adi+Dxew@mail.gmail.com>
<CAGYT1XRLUstysEBwWKNVVWrUJCfahCFgmzEkFwpjQmABOaMocg@mail.gmail.com>
<CAFVRaQ0zz-81jReYtOEo2pHXD0GzZ5sr6=mL7nOQ_EBZkRh8MA@mail.gmail.com>
> In my recent experience, there was no issue starting the old primary—it
> came up normally. However, it resulted in a split-brain situation where the
> old primary continued to accept both read and write operations while still
> assuming the other two nodes were replicas.
Hi Tayyab,
A split-brain is definitely not expected behavior. After issuing a
failover or switchover command, always check the exit code to confirm it
succeeded. If it did not, the command output or the PostgreSQL logs should
indicate what went wrong.
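As a rough illustration of the exit code check, here is a minimal Python
sketch. The helper name run_and_check is made up for this example, and the
repmgr.conf path in the usage comment is an assumption; adjust both to your
setup.

```python
import subprocess


def run_and_check(cmd):
    """Run a command and return (ok, returncode, combined_output).

    ok is True only when the process exits with status 0.
    """
    try:
        proc = subprocess.run(
            cmd,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # keep stdout and stderr together
            text=True,
        )
    except FileNotFoundError as exc:
        # The binary is not on PATH; report it as a failure instead of crashing.
        return False, 127, str(exc)
    return proc.returncode == 0, proc.returncode, proc.stdout


# Hypothetical usage on the standby that is about to be promoted
# (the config file path is an assumption):
#
#   ok, code, output = run_and_check(
#       ["repmgr", "-f", "/etc/repmgr/15/repmgr.conf", "standby", "switchover"]
#   )
#   if not ok:
#       print(f"switchover failed with exit code {code}")
#       print(output)
```

The only repmgr-specific assumption is that it exits with 0 on success and
non-zero otherwise, so the same wrapper works for any other command.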
It seems that either the previous primary could not be shut down or repmgr
somehow failed to convert it to a standby. Repmgr sets the node's role by
creating the standby.signal file in the data directory. Upon startup, if
Postgres finds the signal file, it will assume the standby role (provided
the postgresql.conf file has the correct configuration too). I can only
theorize here, but maybe repmgr failed to write the signal file in $PGDATA,
either due to lack of permissions or a network failure.
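To test the theory above, something like the following could be run on the
old primary as the postgres user. It is only a sketch: the function name is
hypothetical, and the data directory in the usage comment is the one from
the repmgr.conf posted earlier in the thread.

```python
import os
from pathlib import Path


def standby_signal_status(pgdata):
    """Report whether the data directory exists, is writable by the
    current user, and already contains a standby.signal file."""
    data_dir = Path(pgdata)
    signal_file = data_dir / "standby.signal"
    return {
        "data_dir_exists": data_dir.is_dir(),
        "data_dir_writable": os.access(data_dir, os.W_OK),
        "standby_signal_present": signal_file.is_file(),
    }


# Hypothetical usage (path taken from the posted repmgr.conf):
#
#   print(standby_signal_status("/var/lib/pgsql/15/data"))
```

If the directory is not writable, or the signal file is missing after the
switchover attempt, that would support the permissions theory. On a running
instance, SELECT pg_is_in_recovery(); returning true is another way to
confirm the node actually came up as a standby.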
The exact output would help in figuring out what went wrong.
Regards,
Fernando
On Wed, Oct 1, 2025 at 4:04 PM, Tayyab Fayyaz ([email protected])
wrote:
> Hello Fernando,
>
> In my recent experience, there was no issue starting the old primary—it
> came up normally. However, it resulted in a split-brain situation where the
> old primary continued to accept both read and write operations while still
> assuming the other two nodes were replicas.
>
> This issue occurred with the following environment:
>
> - *OS version:* RHEL 8.10
> - *Postgres DB version:* 14.9
> - *repmgr version:* 5.5.0
>
> Tayyab
>
> On Wed, Oct 1, 2025 at 11:52 AM Fernando Hevia <[email protected]> wrote:
>
>>
>>> I have 2 postgresql servers. One is the primary and the other is the
>>> standby. I am trying to set up repmgr to do the switchover manually.
>>> Passwordless ssh has been set up for the postgres ID on both servers.
>>>
>>> I use this command "repmgr standby switchover --log-level=DEBUG
>>> --verbose". The standby database is promoted to primary successfully.
>>> The previous primary database was shut down, but repmgr was not able to
>>> bring it up as a standby.
>>
>>
>> In a switchover the primary server is shut down and restarted as a standby
>> server after the newly promoted primary (former secondary) node has been
>> started.
>> If the primary did not start, there must have been an issue since this is
>> not the standard behavior for a switchover command.
>>
>> Have you checked the Postgres log file for the previous primary? You
>> should find the startup failure cause in the log.
>>
>> Regards,
>> Fernando
>>
>>
>>
>> On Wed, Oct 1, 2025 at 7:30 AM, Chris Lee ([email protected])
>> wrote:
>>
>>> Hi Tayyab,
>>>
>>> Thanks for the information. I also want to find out whether that is
>>> the default behavior, or whether I am not configuring repmgr correctly.
>>>
>>> Regards,
>>> Chris
>>>
>>> On Wed, 1 Oct 2025, 18:12 Imran Khan, <[email protected]> wrote:
>>>
>>>> Hi Tayyab,
>>>>
>>>> Is this the default behavior? We have a 4-node cluster and have never
>>>> had an issue with switchovers.
>>>>
>>>> Thanks,
>>>> Imran
>>>>
>>>> On Wed, Oct 1, 2025, 1:10 PM Tayyab Fayyaz <[email protected]>
>>>> wrote:
>>>>
>>>>> Hello Chris,
>>>>>
>>>>> I faced this issue too. The old primary is not added back
>>>>> automatically as a standby; you have to add it manually.
>>>>>
>>>>> I wrote a script that adds the old primary back as a standby once
>>>>> it's back online.
>>>>>
>>>>> Tayyab
>>>>>
>>>>>
>>>>> On Wed, 1 Oct 2025, 3:02 pm Chris Lee, <[email protected]> wrote:
>>>>>
>>>>>> Hi all,
>>>>>>
>>>>>> I have 2 postgresql servers. One is the primary and the other is
>>>>>> the standby. I am trying to set up repmgr to do the switchover manually.
>>>>>> Passwordless ssh has been set up for the postgres ID on both servers.
>>>>>>
>>>>>> I use this command "repmgr standby switchover --log-level=DEBUG
>>>>>> --verbose". The standby database is promoted to primary successfully.
>>>>>> The previous primary database was shut down, but repmgr was not able to
>>>>>> bring it up as a standby.
>>>>>>
>>>>>> Has anyone encountered this issue before? Thanks a lot for any
>>>>>> suggestions.
>>>>>>
>>>>>> Here is my OS and DB versions:
>>>>>>
>>>>>> OS version: CentOS Stream release 8
>>>>>> Postgres DB version: 15.12
>>>>>> repmgr version: 5.5.0
>>>>>>
>>>>>> Here are the repmgr conf files:
>>>>>> >>>>>
>>>>>> node_id=1 # Use 2 on standby
>>>>>> node_name='primary'
>>>>>> conninfo='host=centos804 user=repmgr dbname=repmgr password=xxx connect_timeout=15'
>>>>>> use_primary_conninfo_password=true
>>>>>> data_directory='/var/lib/pgsql/15/data' # Adjust for your setup
>>>>>> pg_bindir='/usr/pgsql-15/bin'
>>>>>> service_start_command = 'sudo systemctl start postgresql-15'
>>>>>> service_stop_command = 'sudo systemctl stop postgresql-15'
>>>>>> <<<<<
>>>>>>
>>>>>> >>>>>
>>>>>> node_id=2 # Use 2 on standby
>>>>>> node_name='standby'
>>>>>> conninfo='host=centos803 user=repmgr dbname=repmgr password=xxx connect_timeout=15'
>>>>>> use_primary_conninfo_password=true
>>>>>> data_directory='/var/lib/pgsql/15/data' # Adjust for your setup
>>>>>> pg_bindir='/usr/pgsql-15/bin'
>>>>>> service_start_command = 'sudo systemctl start postgresql-15'
>>>>>> service_stop_command = 'sudo systemctl stop postgresql-15'
>>>>>> <<<<<
>>>>>>
>>>>>> Regards,
>>>>>> Chris
>>>>>>
>>>>>