MIME-Version: 1.0
References: 
 <CACTYHzgopGRZmWeWPLXKw=McAQigLC51+mqsmyULypKhfMSJ_w@mail.gmail.com>
 <TYWP286MB263323CD07D3D6A921C0957CF2F6A@TYWP286MB2633.JPNP286.PROD.OUTLOOK.COM>
 <CACTYHzhwTMS0p6VujLzXhQrr3SHTsWcSei6BMLfceSNbajwREQ@mail.gmail.com>
In-Reply-To: 
 <CACTYHzhwTMS0p6VujLzXhQrr3SHTsWcSei6BMLfceSNbajwREQ@mail.gmail.com>
From: VASUKI M <vasukim1992002@gmail.com>
Date: Thu, 23 Oct 2025 12:43:04 +0530
Message-ID: 
 <CACTYHzhnvjY5G8ErTh3-EaFNgExAb=hng4CDy-7-0X7wYGin2w@mail.gmail.com>
Subject: Fwd: Automating Failover Resync & Re-Attach in pgpool2
To: pgpool-general@lists.postgresql.org
Cc: bharatdb@cdac.in
Content-Type: multipart/alternative; boundary="000000000000bdd0ae0641ce2887"
Archived-At: 
 <https://www.postgresql.org/message-id/CACTYHzhnvjY5G8ErTh3-EaFNgExAb%3Dhng4CDy-7-0X7wYGin2w%40mail.gmail.com>
Precedence: bulk

--000000000000bdd0ae0641ce2887
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable

Hi Bo,

Thank you very much for your clarification and the helpful links on
follow_primary_command and auto_failback. I went through those sections in
the documentation, and I now understand that Pgpool-II can automatically
follow the new primary and reattach a standby node once it becomes
available again.

However, my idea was aimed at handling cases where the *old primary
diverges in timeline or LSN* after a failover, for example when the new
primary executes additional writes before the old primary rejoins. In such
cases, the existing auto-failback or follow-primary mechanisms can't
directly reattach the old node because its data is no longer in sync with
the current primary.
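
To make the divergence case concrete, here is a minimal Python sketch of
the check I have in mind. It is not Pgpool-II code: the function names
(parse_lsn, needs_rewind) and the switch_points mapping are hypothetical,
and how the timeline IDs and LSNs are collected from each node is left out.
It assumes the new primary's timeline history tells us at which LSN it left
each parent timeline.

```python
from typing import Dict


def parse_lsn(lsn: str) -> int:
    """Convert a PostgreSQL LSN such as '0/3000060' into a 64-bit integer."""
    hi, lo = lsn.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


def needs_rewind(old_tli: int, old_end_lsn: str, new_tli: int,
                 switch_points: Dict[int, str]) -> bool:
    """Decide whether the returning node has WAL the new primary never saw.

    switch_points maps a parent timeline ID to the LSN at which the new
    primary's history left that timeline (hypothetical input structure).
    """
    if old_tli == new_tli:
        # Same timeline: ordinary streaming replication can resume.
        return False
    fork = switch_points.get(old_tli)
    if fork is None:
        # The old timeline is not a known ancestor; treat it as diverged.
        return True
    # Diverged only if the old node wrote WAL past the fork point.
    return parse_lsn(old_end_lsn) > parse_lsn(fork)
```

For example, needs_rewind(1, "0/5000100", 2, {1: "0/5000000"}) reports a
divergence, while an old node that stopped exactly at the fork point could
simply follow the new primary.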

To address that, I was exploring a built-in *auto-resync enhancement* where
Pgpool-II could internally perform the following before reattaching:

   1. *Detect timeline mismatch* between the new primary and the returning
      node.
   2. *Automatically run pg_rewind* (or WAL-based replay) to synchronize the
      old node's data directory.
   3. *Restart and reattach the node* to the pool automatically once the
      resync is complete.
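
Steps 2 and 3 above could be sketched roughly as follows. This is only an
illustration in Python, not a proposed implementation: ReturningNode,
resync_commands and run_all are hypothetical names, the commands would have
to run on the old node's host, and the sketch assumes the old node is
already stopped, that wal_log_hints or data checksums are enabled (which
pg_rewind requires), and that PostgreSQL 13 or later is used for
--write-recovery-conf.

```python
import subprocess
from dataclasses import dataclass
from typing import List


@dataclass
class ReturningNode:
    node_id: int   # Pgpool-II backend node index
    pgdata: str    # data directory of the old primary


def resync_commands(node: ReturningNode, source_conninfo: str,
                    pcp_host: str, pcp_port: int,
                    pcp_user: str) -> List[List[str]]:
    """Build the command lines for rewind, restart and reattach."""
    return [
        # Step 2: rewind the old data directory against the new primary
        # and write the standby configuration.
        ["pg_rewind",
         f"--target-pgdata={node.pgdata}",
         f"--source-server={source_conninfo}",
         "--write-recovery-conf"],
        # Step 3a: start the rewound node and wait until it is up.
        ["pg_ctl", "start", "-D", node.pgdata, "-w"],
        # Step 3b: reattach the node to Pgpool-II through PCP
        # (-w reads the password from the .pcppass file).
        ["pcp_attach_node", "-h", pcp_host, "-p", str(pcp_port),
         "-U", pcp_user, "-w", "-n", str(node.node_id)],
    ]


def run_all(commands: List[List[str]], dry_run: bool = True) -> None:
    """Print the commands, or execute them and stop at the first failure."""
    for cmd in commands:
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)
```

Calling run_all(resync_commands(...)) with the default dry_run=True only
prints the three commands, which is a convenient way to review them before
anything touches the data directory.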

This would essentially extend the existing auto_failback behavior to
include *automated resynchronization*, reducing manual intervention and
ensuring consistent cluster recovery even in timeline divergence scenarios.

I'm thinking of something like a new configuration section in pgpool.conf:

auto_resync = on
resync_method = 'pg_rewind'
resync_user = 'replicator'

The feature could hook into the existing failback workflow (perhaps in
failover.c or recovery.c), so that Pgpool performs resync + reattach
seamlessly when the failed node returns.
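
As a rough sketch of how such a hook might decide what to do when a node
returns, here is a small Python illustration. Everything in it is
hypothetical: the auto_resync, resync_method and resync_user parameters do
not exist in Pgpool-II today, and parse_conf, Action and on_node_return are
names made up for this example rather than anything in failover.c or
recovery.c.

```python
from enum import Enum
from typing import Dict


class Action(Enum):
    ATTACH = "attach"
    RESYNC_THEN_ATTACH = "resync_then_attach"
    LEAVE_DETACHED = "leave_detached"


def parse_conf(text: str) -> Dict[str, str]:
    """Parse simple "key = value" lines, dropping comments and quotes."""
    settings: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if not line or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip().strip("'")
    return settings


def on_node_return(settings: Dict[str, str], diverged: bool) -> Action:
    """Pick the failback action for a node that has become reachable again."""
    if not diverged:
        # Nothing to repair: the existing failback path is enough.
        return Action.ATTACH
    if (settings.get("auto_resync") == "on"
            and settings.get("resync_method") == "pg_rewind"):
        # Proposed behavior: resynchronize first, then reattach.
        return Action.RESYNC_THEN_ATTACH
    # Current behavior: a diverged node needs manual recovery.
    return Action.LEAVE_DETACHED
```

With the three example settings above, a diverged node would map to
RESYNC_THEN_ATTACH, and with auto_resync off it would stay detached as it
does today.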

Would this be something the Pgpool team would consider as an enhancement?

Thanks again for your time and guidance.

Best regards,
*Vasuki M*
CDAC, Chennai
vasukim1992002@gmail.com

On Fri, 17 Oct 2025 at 13:30, Bo Peng <pengbo@sraoss.co.jp> wrote:

> Hi,
>
> Thank you for your question.
>
> > While working with PostgreSQL failover scenarios, I noticed that the
> > process of re-attaching a standby node after a failover can be somewhat
> > manual and prone to delays, especially in production environments.
>
> After a failover, the standby nodes can be automatically attached to the
> new primary by setting "follow_primary_command".
>
>
> https://www.pgpool.net/docs/latest/en/html/runtime-config-failover.html#RUNTIME-CONFIG-FAILOVER-SETTINGS
>
> You can also automatically reattach a failed standby node by setting
> "auto_failback = on".
>
> https://www.pgpool.net/docs/latest/ja/html/runtime-config-failover.html#GUC-AUTO-FAILBACK
>
> ---
> Bo Peng <pengbo@sraoss.co.jp>
> SRA OSS K.K.
> TEL: 03-5979-2701 FAX: 03-5979-2702
> Mobile: 080-7752-0749
> URL: https://www.sraoss.co.jp/
>
>
> ________________________________________
> From: VASUKI M <vasukim1992002@gmail.com>
> Sent: Friday, 10 October 2025 21:17
> To: pgsql-bugs@lists.postgresql.org <pgsql-bugs@lists.postgresql.org>
> Cc: bharatdb@cdac.in <bharatdb@cdac.in>;
> pgpool-general@lists.postgresql.org <pgpool-general@lists.postgresql.org>
> Subject: Automating Failover Resync & Re-Attach in pgpool2
>
> Dear PostgreSQL and Pgpool Communities,
>
> While working with PostgreSQL failover scenarios, I noticed that the
> process of re-attaching a standby node after a failover can be somewhat
> manual and prone to delays, especially in production environments.
>
> I explored automating this process using a combination of pg_rewind and
> WAL replay, which allows a standby node to resynchronize and re-attach to
> the primary automatically after a failover. This could reduce downtime and
> simplify management of failover nodes in high-availability setups.
>
>    - Automatically resynchronize after failover
>    - Reduce downtime and ensure quicker recovery
>    - Minimize manual operations and errors
>    - Maintain consistent cluster state with less administrative overhead
>
> I believe that integrating such an automated resync and re-attach feature
> into Pgpool-II could be very valuable for PostgreSQL users, potentially as
> an enhancement in a future release.
>
> I wanted to share this idea with the community to get feedback,
> suggestions, or any pointers on existing work that may align with this. I
> am happy to contribute more details

--000000000000bdd0ae0641ce2887--