From: Tayyab Fayyaz
Date: Fri, 3 Oct 2025 18:51:40 +0500
Subject: Re: repmgr cannot bring up the standby database after switchover manaully
To: Pavan Kumar
Cc: Fernando Hevia, Chris Lee, Imran Khan, pgsql-admin

Hello Pavan,

Please share the required PostgreSQL parameters; I will compare them with my existing configuration.

As I understand, automatically re-adding the old primary as a standby is not an out-of-the-box feature and needs to be handled manually. Is that correct?

Tayyab

On Fri, 3 Oct 2025, 6:20 pm Pavan Kumar, <pavan.dba27@gmail.com> wrote:

Hello Chris,

I hope you have configured the required parameters in PostgreSQL. I have noticed the same issue when the primary is idle (no activity).
Before doing a switchover, please perform a checkpoint on the primary and then run the switchover command.
Review the output of "repmgr -f repmgr.conf cluster event"; it will provide more information on what happened during the switchover.

Note: Make sure the repmgr daemons are running and not paused before the switchover.
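
The checks described above could be scripted roughly as follows. This is only a sketch: the function name `pre_switchover_checks` and the repmgr.conf path are assumptions, and it presumes the commands are run as the postgres OS user on the current primary with `psql` and `repmgr` on the PATH.

```shell
#!/usr/bin/env bash
# Hypothetical pre-switchover helper (not part of repmgr).
# Assumed location of repmgr.conf; adjust for your installation.
REPMGR_CONF="${REPMGR_CONF:-/etc/repmgr/15/repmgr.conf}"

pre_switchover_checks() {
    # Force a checkpoint on the primary before switching over.
    psql -X -d postgres -c 'CHECKPOINT;' || return 1

    # Show whether repmgrd is running or paused on each node.
    repmgr -f "$REPMGR_CONF" service status || return 1

    # List recent cluster events for later comparison.
    repmgr -f "$REPMGR_CONF" cluster event --limit 20 || return 1
}
```

After these checks pass, `repmgr standby switchover --dry-run` on the standby is a further way to rehearse the switchover without changing anything.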




On Wed, Oct 1, 2025 at 3:03 PM Fernando Hevia <fhevia@gmail.com> wrote:

> In my recent experience, there was no issue starting the old primary; it came up normally. However, it resulted in a split-brain situation where the old primary continued to accept both read and write operations while still assuming the other two nodes were replicas.

Hi Tayyab,

A split-brain is definitely unexpected behavior. After issuing a failover or switchover command, always check the exit code to ensure it was successful. If it was not, you should find an indication of what went wrong in the command output or in the PostgreSQL logs.
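
One way to make the exit-code check hard to forget is a small wrapper like the one below. It is a sketch only; `run_checked` is a hypothetical helper name, and the repmgr.conf path in the usage comment is a placeholder.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: run a command and report its exit code explicitly.
run_checked() {
    "$@"
    _rc=$?
    if [ "$_rc" -ne 0 ]; then
        # Non-zero exit: do not assume the role change happened.
        echo "FAILED (exit code $_rc): $*" >&2
    else
        echo "OK: $*"
    fi
    return "$_rc"
}

# Usage, on the standby that is to be promoted:
#   run_checked repmgr -f /path/to/repmgr.conf standby switchover --verbose
```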

It seems that either the previous primary couldn't be shut down or repmgr somehow failed to change it to a standby. repmgr sets the node's role by creating the standby.signal file in the data directory. Upon startup, if Postgres finds the signal file, it will assume the standby role (provided the postgresql.conf file has the correct configuration too). I can only theorize here, but maybe repmgr failed to write the signal file in $PGDATA, either due to a lack of permissions or a network failure.
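
To test that theory on the old primary, a check along these lines may help. It is a sketch; `check_standby_signal` is a hypothetical helper, and the data directory must be passed in by the caller.

```shell
#!/usr/bin/env bash
# Hypothetical check: is standby.signal present in the given data directory?
check_standby_signal() {
    pgdata="$1"   # data directory, e.g. the value of data_directory in repmgr.conf
    if [ -f "$pgdata/standby.signal" ]; then
        echo "standby.signal present in $pgdata"
        return 0
    fi
    echo "standby.signal missing in $pgdata" >&2
    return 1
}

# Once the server is running, the role can also be confirmed from SQL:
#   psql -X -Atc 'SELECT pg_is_in_recovery();'
```

If the file is missing while the node is supposed to be a standby, it should not be started until the cause is understood, since it would come up as a second primary.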

The exact output would help in figuring out what went wrong.

Regards,
Fernando




On Wed, 1 Oct 2025 at 4:04 p.m., Tayyab Fayyaz (tayyab.humayl@gmail.com) wrote:
Hello Fernando,

In my recent experience, there was no issue starting the old primary; it came up normally. However, it resulted in a split-brain situation where the old primary continued to accept both read and write operations while still assuming the other two nodes were replicas.

This issue occurred with the following environment:

  • OS version: RHEL 8.10

  • Postgres DB version: 14.9

  • repmgr version: 5.5.0

Tayyab

On Wed, Oct 1, 2025 at 11:52 AM Fernando Hevia <fhevia@gmail.com> wrote:

> I have 2 PostgreSQL servers. One is the primary and the other is the standby. I am trying to set up repmgr to do a manual switchover. Passwordless SSH has been set up for the postgres user on both servers.
>
> I use the command "repmgr standby switchover --log-level=DEBUG --verbose". The standby database is promoted to primary successfully. The previous primary database was shut down, but repmgr was not able to bring it back up as a standby.

In a switchover, the primary server is shut down and restarted as a standby server after the newly promoted primary (former secondary) node has been started.
If the old primary did not start, there must have been an issue, since this is not the standard behavior for a switchover command.

Have you checked the Postgres log file for the previous primary? You should find the cause of the startup failure in the log.

Regards,
Fernando

On Wed, 1 Oct 2025 at 7:30 a.m., Chris Lee (clee.hk@gmail.com) wrote:
Hi Tayyab,

Thanks for the information. I also want to find out whether that is the default behavior, or whether I have not configured repmgr correctly.

Regards,
Chris

On Wed, 1 Oct 2025, 18:12 Imran Khan, <imran.k.23@gmail.com> wrote:

Hi Tayyab,

Is this the default behavior? We have a 4-node cluster and have never had an issue with switchovers.

Thanks,
Imran

On Wed, Oct 1, 2025, 1:10 PM Tayyab Fayyaz <tayyab.humayl@gmail.com> wrote:

Hello Chris,

I faced this issue as well. The old primary is not added back automatically as a standby; you have to add it manually.

However, I wrote a script that adds the old primary back as a standby once it is back online.
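
The script itself was not posted. As a rough illustration only, a manual re-add could be built on `repmgr node rejoin`, as sketched below. The function name and the repmgr.conf path are assumptions. The sketch also presumes that PostgreSQL on the old primary is stopped and that the prerequisites of pg_rewind (`wal_log_hints = on` or data checksums) are met, since `--force-rewind` relies on it.

```shell
#!/usr/bin/env bash
# Hypothetical helper, run on the old primary as the postgres OS user.
# Assumed location of repmgr.conf; adjust for your installation.
REPMGR_CONF="${REPMGR_CONF:-/etc/repmgr/15/repmgr.conf}"

rejoin_old_primary() {
    primary_conninfo="$1"   # conninfo string of the current (new) primary

    # Rehearse first; stop if the dry run reports a problem.
    repmgr -f "$REPMGR_CONF" node rejoin -d "$primary_conninfo" \
        --force-rewind --verbose --dry-run || return 1

    # Re-attach this node to the cluster as a standby.
    repmgr -f "$REPMGR_CONF" node rejoin -d "$primary_conninfo" \
        --force-rewind --verbose
}

# Usage (host name taken from the configuration posted in this thread):
#   rejoin_old_primary 'host=centos803 user=repmgr dbname=repmgr'
```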

Tayyab


On Wed, 1 Oct 2025, 3:02 pm Chris Lee, <clee.hk@gmail.com> wrote:

Hi all,

I have 2 PostgreSQL servers. One is the primary and the other is the standby. I am trying to set up repmgr to do a manual switchover. Passwordless SSH has been set up for the postgres user on both servers.

I use the command "repmgr standby switchover --log-level=DEBUG --verbose". The standby database is promoted to primary successfully. The previous primary database was shut down, but repmgr was not able to bring it back up as a standby.

Has anyone encountered this issue before? Thanks a lot for any suggestions.

Here are my OS and DB versions:

OS version: CentOS Stream release 8
Postgres DB version: 15.12
repmgr version: 5.5.0

Here are the repmgr conf files:
>>>>>
node_id=1  # Use 2 on standby
node_name='primary'
conninfo='host=centos804 user=repmgr dbname=repmgr password=xxx connect_timeout=15'
use_primary_conninfo_password=true
data_directory='/var/lib/pgsql/15/data'  # Adjust for your setup
pg_bindir='/usr/pgsql-15/bin'
service_start_command = 'sudo systemctl start postgresql-15'
service_stop_command  = 'sudo systemctl stop postgresql-15'
<<<<<

>>>>>
node_id=2  # Use 2 on standby
node_name='standby'
conninfo='host=centos803 user=repmgr dbname=repmgr password=xxx connect_timeout=15'
use_primary_conninfo_password=true
data_directory='/var/lib/pgsql/15/data'  # Adjust for your setup
pg_bindir='/usr/pgsql-15/bin'
service_start_command = 'sudo systemctl start postgresql-15'
service_stop_command  = 'sudo systemctl stop postgresql-15'
<<<<<

Regards,
Chris


--
Regards,

#! Pavan Kumar
-----------------------------------------------
Sr. Database Administrator..!

NEXT GENERATION PROFESSIONALS, LLC
Cell # 267-799-3182 # pavan.dba27 (Gtalk)
India # 9000459083

Take Risks; if you win, you will be very happy. If you lose you will be Wise