From: Pavan Kumar
Date: Fri, 3 Oct 2025 08:19:50 -0500
Subject: Re: repmgr cannot bring up the standby database after switchover manaully
To: Fernando Hevia
Cc: Tayyab Fayyaz, Chris Lee, Imran Khan, pgsql-admin

Hello Chris,

I hope you have configured the required parameters in PostgreSQL. I have noticed the same issue when the primary is idle (no activity).
Before doing the switchover, please perform a checkpoint on the primary and then run the switchover command.
Review the output of repmgr -f repmgr.conf cluster event; it provides more information on what happened during the switchover.

Note: Make sure the repmgr daemons are running and not in pause mode before the switchover.
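
The sequence described above could be sketched roughly as follows. This is only an illustration under assumptions, not a tested procedure: the function name is made up, the default host centos804 is taken from the node 1 conninfo quoted later in this thread, /etc/repmgr.conf is a hypothetical path, and the repmgr database role is assumed to be allowed to run CHECKPOINT. The function is meant to be run on the standby that is to be promoted.

```shell
# Sketch only: PRIMARY_HOST and REPMGR_CONF are assumed values, adjust for your setup.
PRIMARY_HOST="${PRIMARY_HOST:-centos804}"
REPMGR_CONF="${REPMGR_CONF:-/etc/repmgr.conf}"

# Hypothetical helper: checkpoint the primary, switch over, then list events.
checkpoint_then_switchover() {
    # Force a checkpoint on the current primary before the switchover.
    psql -h "$PRIMARY_HOST" -U repmgr -d repmgr -c 'CHECKPOINT;' || return 1

    # Run the switchover and remember its exit code.
    repmgr -f "$REPMGR_CONF" standby switchover
    local rc=$?

    # Show the recorded cluster events whether or not the switchover succeeded.
    repmgr -f "$REPMGR_CONF" cluster event

    # Hand the switchover's exit code back to the caller.
    return "$rc"
}
```

Calling `checkpoint_then_switchover || echo "switchover failed"` keeps the exit code visible, so a failed switchover is not mistaken for a successful one.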




On Wed, Oct 1, 2025 at 3:03 PM Fernando Hevia <fhevia@gmail.com> wrote:

> In my recent experience, there was no issue starting the old primary; it came up normally. However, it resulted in a split-brain situation where the old primary continued to accept both read and write operations while still assuming the other two nodes were replicas.

Hi Tayyab,

A split-brain is definitely unexpected behavior. After issuing a failover or switchover command, always check the exit code to ensure it was successful. If it was not, you should find an indication of what went wrong in the command output or in the PostgreSQL logs.

It seems that either the previous primary couldn't be shut down or repmgr somehow failed to change it to a standby. repmgr sets the node's role by creating the standby.signal file in the data directory. Upon startup, if Postgres finds the signal file, it will assume the standby role (provided the postgresql.conf file has the correct configuration too). I can only theorize here, but maybe repmgr failed to write the signal file in $PGDATA, due to either a lack of permissions or a network failure.
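
A quick way to test that theory on the old primary might look like the sketch below. The helper names are hypothetical, and the default data directory is simply the data_directory value from the repmgr.conf quoted later in this thread. The first function only checks for the signal file on disk; the second asks a running server whether it considers itself in recovery.

```shell
# Hypothetical helper: is the data directory marked as a standby on disk?
is_marked_standby() {
    # Default path is an assumption taken from the quoted repmgr.conf.
    local pgdata="${1:-/var/lib/pgsql/15/data}"
    [ -f "$pgdata/standby.signal" ]
}

# Hypothetical helper: does the running server report that it is in recovery?
# Any arguments (for example -h HOST -U USER -d DBNAME) are passed on to psql.
reports_recovery() {
    # -X: ignore psqlrc, -A: unaligned output, -t: tuples only.
    [ "$(psql -XAt "$@" -c 'SELECT pg_is_in_recovery();')" = "t" ]
}
```

If the file is missing and `reports_recovery` fails on a node that repmgr believes is a standby, that node is running as a primary, which matches the split-brain symptom described above.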

The exact output would help in figuring out what went wrong.

Regards,
Fernando





On Wed, Oct 1, 2025 at 4:04 PM, Tayyab Fayyaz (tayyab.humayl@gmail.com) wrote:
Hello Fernando,

In my recent experience, there was no issue starting the old primary; it came up normally. However, it resulted in a split-brain situation where the old primary continued to accept both read and write operations while still assuming the other two nodes were replicas.

This issue occurred with the following environment:

  • OS version: RHEL 8.10

  • Postgres DB version: 14.9

  • repmgr version: 5.5.0

Tayyab

On Wed, Oct 1, 2025 at 11:52 AM Fernando Hevia <fhevia@gmail.com> wrote:

> I have 2 PostgreSQL servers. One is the primary and the other is the standby. I am trying to set up repmgr to do the switchover manually. Passwordless ssh has been set up for the postgres ID on both servers.
>
> I use this command "repmgr standby switchover --log-level=DEBUG --verbose". The standby database is promoted to primary successfully. The previous primary database was shut down, but repmgr was not able to bring it up as a standby.

In a switchover, the primary server is shut down and restarted as a standby server after the newly promoted primary (former secondary) node has been started.
If the primary did not start, there must have been an issue, since this is not the standard behavior for a switchover command.

Have you checked the Postgres log file for the previous primary? You should find the cause of the startup failure in the log.

Regards,
Fernando

On Wed, Oct 1, 2025 at 7:30 AM, Chris Lee (clee.hk@gmail.com) wrote:
Hi Tayyab,

Thanks for the information. I also want to find out whether that is the default behavior, or whether I am not configuring repmgr correctly.

Regards,
Chris

On Wed, 1 Oct 2025, 18:12 Imran Khan, <imran.k.23@gmail.com> wrote:
Hi Tayyab,

Is this the default behavior? We have a 4-node cluster but have never had an issue with switchovers.

Thanks,
Imran

On Wed, Oct 1, 2025, 1:10 PM Tayyab Fayyaz <tayyab.humayl@gmail.com> wrote:
Hello Chris,

I faced this issue: the old primary is not added back automatically as a standby; you have to add it manually.

But I wrote a script that adds the old primary back as a standby once it's back online.
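
The script itself is not shown in the thread. One way such a manual re-add could look, using repmgr's own node rejoin command, is sketched below. This is not the script mentioned above; the function name, the /etc/repmgr.conf default and the example conninfo are assumptions. repmgr node rejoin expects PostgreSQL on the old primary to be stopped, and --force-rewind relies on pg_rewind, which in turn needs wal_log_hints or data checksums to be enabled.

```shell
# Hypothetical helper, run on the old primary while its PostgreSQL is stopped.
# $1: conninfo of the NEW primary; $2: path to this node's repmgr.conf (assumed default).
rejoin_old_primary() {
    local new_primary_conninfo="$1"
    local conf="${2:-/etc/repmgr.conf}"

    # Dry run first to check the prerequisites, then the real rejoin.
    repmgr -f "$conf" node rejoin -d "$new_primary_conninfo" --force-rewind --dry-run &&
    repmgr -f "$conf" node rejoin -d "$new_primary_conninfo" --force-rewind
}
```

For example: `rejoin_old_primary 'host=centos803 user=repmgr dbname=repmgr'`, where the host is whichever node is the primary after the switchover.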

Tayyab


On Wed, 1 Oct 2025, 3:02 PM Chris Lee, <clee.hk@gmail.com> wrote:
Hi all,

I have 2 PostgreSQL servers. One is the primary and the other is the standby. I am trying to set up repmgr to do the switchover manually. Passwordless ssh has been set up for the postgres ID on both servers.

I use this command "repmgr standby switchover --log-level=DEBUG --verbose". The standby database is promoted to primary successfully. The previous primary database was shut down, but repmgr was not able to bring it up as a standby.

Has anyone encountered this issue before? Thanks a lot for any suggestions.

Here are my OS and DB versions:

OS version: CentOS Stream release 8
Postgres DB version: 15.12
repmgr version: 5.5.0

Here are the repmgr conf files:
>>>>>
node_id=1  # Use 2 on standby
node_name='primary'
conninfo='host=centos804 user=repmgr dbname=repmgr password=xxx connect_timeout=15'
use_primary_conninfo_password=true
data_directory='/var/lib/pgsql/15/data'  # Adjust for your setup
pg_bindir='/usr/pgsql-15/bin'
service_start_command = 'sudo systemctl start postgresql-15'
service_stop_command  = 'sudo systemctl stop postgresql-15'
<<<<<

>>>>>
node_id=2  # Use 2 on standby
node_name='standby'
conninfo='host=centos803 user=repmgr dbname=repmgr password=xxx connect_timeout=15'
use_primary_conninfo_password=true
data_directory='/var/lib/pgsql/15/data'  # Adjust for your setup
pg_bindir='/usr/pgsql-15/bin'
service_start_command = 'sudo systemctl start postgresql-15'
service_stop_command  = 'sudo systemctl stop postgresql-15'
<<<<<
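
Since both files start and stop PostgreSQL through sudo, one thing worth ruling out is whether the postgres OS user can run those commands without a password prompt. Nothing in the thread confirms this is the cause; the sketch below is just a pre-flight check under that assumption, with hypothetical function names, and the unit name postgresql-15 taken from the files above. It only asks sudo whether the commands are permitted and does not start or stop anything.

```shell
# Hypothetical check, to be run as the postgres OS user on each node.
can_sudo_noninteractive() {
    # -n: never prompt for a password; -l CMD: ask whether CMD is permitted.
    sudo -n -l "$@" >/dev/null 2>&1
}

check_service_commands() {
    local unit="${1:-postgresql-15}"
    local action
    for action in start stop; do
        if can_sudo_noninteractive systemctl "$action" "$unit"; then
            echo "ok: sudo systemctl $action $unit"
        else
            echo "FAIL: sudo systemctl $action $unit needs a password or is not permitted"
            return 1
        fi
    done
}
```

Running `check_service_commands postgresql-15` on both nodes should print two "ok" lines each; a "FAIL" line would mean repmgr cannot restart the old primary through service_start_command.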

Regards,
Chris


--
Regards,

#! Pavan Kumar
----------------------------------------------
Sr. Database Administrator..!

NEXT GENERATION PROFESSIONALS, LLC
Cell # 267-799-3182 # pavan.dba27 (Gtalk)
India # 9000459083

Take Risks; if you win, you will be very happy. If you lose you will be Wise