MIME-Version: 1.0
References: 
 <CAB5fag625D2Y7xE0QzzPPs5FQ8SCYNNHN5CSvD7XA-e_uDAOTg@mail.gmail.com>
In-Reply-To: 
 <CAB5fag625D2Y7xE0QzzPPs5FQ8SCYNNHN5CSvD7XA-e_uDAOTg@mail.gmail.com>
From: Asad Ali <asadalinagri@gmail.com>
Date: Thu, 19 Sep 2024 11:00:26 +0500
Message-ID: 
 <CAJ9xe=sr1j59XpXpwHpU4Ez12L=WHOwG=Vj1YH9WyBPEy8qWzQ@mail.gmail.com>
Subject: Re: Automatic failback
To: Wasim Devale <wasimd60@gmail.com>
Cc: Pgsql-admin <pgsql-admin@lists.postgresql.org>,
	pgsql-admin <pgsql-admin@postgresql.org>
Content-Type: multipart/alternative; boundary="0000000000006d5c00062272a434"
Archived-At: 
 <https://www.postgresql.org/message-id/CAJ9xe%3Dsr1j59XpXpwHpU4Ez12L%3DWHOwG%3DVj1YH9WyBPEy8qWzQ%40mail.gmail.com>
Precedence: bulk

--0000000000006d5c00062272a434
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable

Hi Wasim,

To achieve automatic failback with minimal or zero downtime during disaster
recovery (DR) using *Barman* and PostgreSQL in your Azure setup, here's a
high-level architecture and strategy you can follow:

1. Set up Barman in the Azure West region to back up the PostgreSQL
database running in the Azure East region. Use streaming replication to
keep the DR database up to date with the primary database.

   - *Primary Database:* Configure continuous WAL streaming to the standby
   in the West region, and archive WAL to Barman (archive_mode = on, with
   archive_command calling barman-wal-archive).

   - *Standby Database:* Configure this as a hot standby (read-only), ready
   to be promoted in case of failover, and have it receive WAL data via
   streaming replication.
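
As a rough illustration of the primary-side settings, here is a small
Python sketch that checks a set of values such as you would read from
pg_settings. The function name, the host name "barman-west" and the Barman
server name "pg-east" are placeholders I made up, so adjust them to your
environment:

```python
# Hypothetical pre-flight check for the primary's archiving and streaming
# settings. The input dict mimics rows of: SELECT name, setting FROM pg_settings


def check_primary_settings(settings: dict[str, str]) -> list[str]:
    """Return a list of problems; an empty list means the settings look fine."""
    problems = []
    if settings.get("wal_level") not in ("replica", "logical"):
        problems.append("wal_level must be 'replica' or 'logical'")
    if settings.get("archive_mode") not in ("on", "always"):
        problems.append("archive_mode must be 'on' (or 'always')")
    archive_command = settings.get("archive_command", "")
    if "barman-wal-archive" not in archive_command or "%p" not in archive_command:
        problems.append("archive_command should call barman-wal-archive with %p")
    if int(settings.get("max_wal_senders", "0")) < 1:
        problems.append("max_wal_senders must be at least 1 for streaming")
    return problems


example = {
    "wal_level": "replica",
    "archive_mode": "on",
    # Placeholder Barman host and server name, followed by the WAL file path.
    "archive_command": "barman-wal-archive barman-west pg-east %p",
    "max_wal_senders": "10",
}
print(check_primary_settings(example))  # prints []
```

The check is deliberately a pure function, so it can be run against values
collected from any node without needing a database connection.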

2. Implement an automatic failover mechanism using a tool like *Patroni* or
*pg_auto_failover*. These tools monitor the primary database and, in case
of failure, automatically promote the standby database to the primary role.

   - *Patroni*: A cluster manager for PostgreSQL with high availability,
   automatically promoting a standby to primary when a failure is detected.
   - *pg_auto_failover*: Another option that provides automatic failover
   between primary and standby PostgreSQL databases, making sure the standby
   can seamlessly take over.

3. After recovery, once the primary database in the east region becomes
available again, you need to set up *automatic failback*. Here's how you
can handle failback:

   - *Step 1: Re-establish Streaming Replication*: After promoting the DR
   database in the west region, reconfigure the old primary in the east
   region as a standby, streaming from the promoted DR database (west).
      - The old primary's timeline has diverged from the new primary, so it
      cannot simply be restarted as a replica. Resynchronize it first,
      either with pg_rewind (which needs wal_log_hints = on or data
      checksums) or by rebuilding it from a fresh base backup.
      - Barman can assist with the rebuild by restoring the latest backup
      to the original region, after which the node catches up through WAL
      streaming.
   - *Step 2: Reverse the Failover (Failback)*: Once the original region is
   stable and the east replica has caught up, reverse the failover with a
   planned switchover, which limits downtime to a brief write pause:
      - Stop write operations on the current primary (west).
      - Perform a controlled failover back to the original primary in the
      east, making it the new primary.
      - Reconfigure the DR site in the west region to again become a
      standby replica.

   Much of this can be automated using *Patroni* or *pg_auto_failover*,
   which can rejoin the old primary as a standby and perform the controlled
   switchover, so the transition needs little manual intervention.
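
If you do the failback by hand rather than through Patroni or
pg_auto_failover, the two steps could be sketched roughly as below. This is
only an illustration under assumptions: the data directory, the host
"pg-west" and the role "rewind_user" are placeholders, the LSN strings
stand in for the results of pg_current_wal_lsn() on the primary and
pg_last_wal_replay_lsn() on the standby, and pg_rewind expects the target
server to have been shut down cleanly first.

```python
import shlex


def pg_rewind_command(target_pgdata: str, source_conninfo: str,
                      dry_run: bool = True) -> list[str]:
    """Build a pg_rewind invocation to resync the old primary (step 1)."""
    cmd = [
        "pg_rewind",
        f"--target-pgdata={target_pgdata}",
        f"--source-server={source_conninfo}",
    ]
    if dry_run:
        # Report what would be done without modifying the target directory.
        cmd.append("--dry-run")
    return cmd


def parse_lsn(lsn: str) -> int:
    """Convert a textual pg_lsn such as '0/3000060' to a byte position."""
    high, low = lsn.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def replay_lag_bytes(primary_lsn: str, standby_replay_lsn: str) -> int:
    """Bytes of WAL the standby still has to replay."""
    return parse_lsn(primary_lsn) - parse_lsn(standby_replay_lsn)


def safe_to_switch_over(primary_lsn: str, standby_replay_lsn: str,
                        writes_stopped: bool) -> bool:
    """Step 2 gate: writes are paused and the standby has replayed everything."""
    return writes_stopped and replay_lag_bytes(primary_lsn, standby_replay_lsn) == 0


# Placeholder path and connection string, for illustration only.
rewind = pg_rewind_command("/var/lib/pgsql/data",
                           "host=pg-west user=rewind_user dbname=postgres")
print(shlex.join(rewind))
print(replay_lag_bytes("0/3000100", "0/3000060"))  # prints 160
```

The idea of the last function is that the switchover only goes ahead once
writes on west are paused and east reports zero replay lag, which is what
keeps the write pause short.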

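4. To further minimize downtime during failback, you can use *logical
replication*:

   - After failover, set up logical replication from the new primary (west)
   to the original primary (east). A logical subscriber must be a regular
   read-write instance rather than a physical hot standby, so east runs as
   a standalone server that applications do not write to yet.
   - Once logical replication has caught up, you can switch application
   traffic to the original primary (east) with virtually no downtime.
   Logical replication does not copy sequence values or DDL changes, so
   handle those separately before the cutover.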
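
For the logical replication option, the statements involved might look like
what this small Python sketch prints. The publication and subscription
names, the host and the role are all made up for the example, and the
publisher (west) would need wal_level = logical:

```python
def publication_sql(pub_name: str) -> str:
    """Statement to run on the publisher (the new primary in west)."""
    return f"CREATE PUBLICATION {pub_name} FOR ALL TABLES;"


def subscription_sql(sub_name: str, pub_name: str, conninfo: str) -> str:
    """Statement to run on the subscriber (the rebuilt instance in east)."""
    # Double any single quotes so the connection string stays a valid literal.
    escaped = conninfo.replace("'", "''")
    return (f"CREATE SUBSCRIPTION {sub_name} "
            f"CONNECTION '{escaped}' PUBLICATION {pub_name};")


# Placeholder names and connection details, for illustration only.
print(publication_sql("failback_pub"))
print(subscription_sql("failback_sub", "failback_pub",
                       "host=pg-west dbname=appdb user=replicator"))
```

The subscriber needs the table definitions created beforehand, since only
row changes are replicated.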

With this setup, the database stays available through a regional outage,
and downtime during failover and failback is limited to the brief
switchover window.

Let me know if you have any other questions.

Best regards,
Asad Ali

On Wed, Sep 18, 2024 at 5:17 PM Wasim Devale <wasimd60@gmail.com> wrote:

> Hi All
>
> I have barman tool in place and can any one suggest automatic failback
> with zero down time.
>
> My PG database is hosted on Linux Red Hat 9. Our all Azure resources are
> on east region. We are planning to do DR disaster recovery in west region.
>
> Thanks,
> Wasim
>

--0000000000006d5c00062272a434--