Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1ulkJb-004w9s-2q for pgsql-general@arkaria.postgresql.org; Tue, 12 Aug 2025 08:25:03 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.94.2) (envelope-from ) id 1ulkJY-0064ZI-Su for pgsql-general@arkaria.postgresql.org; Tue, 12 Aug 2025 08:25:01 +0000 Received: from magus.postgresql.org ([2a02:c0:301:0:ffff::29]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1ulkJY-0064ZA-H1 for pgsql-general@lists.postgresql.org; Tue, 12 Aug 2025 08:25:00 +0000 Received: from mail-ua1-x92d.google.com ([2607:f8b0:4864:20::92d]) by magus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.96) (envelope-from ) id 1ulkJW-000ASB-0O for pgsql-general@lists.postgresql.org; Tue, 12 Aug 2025 08:25:00 +0000 Received: by mail-ua1-x92d.google.com with SMTP id a1e0cc1a2514c-88e38431d8eso413534241.0 for ; Tue, 12 Aug 2025 01:24:57 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1754987096; x=1755591896; darn=lists.postgresql.org; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:from:to:cc:subject:date:message-id:reply-to; bh=Dz0nGUQYX/MddH++XMQj//Q3CBA0YgI/6B1YJorxgPM=; b=HUX5L5RVjP4DULCvP29IsRDkSTIvVsUmbyohF36tmvqeboEFlddLAufFnv7wFi3IwU RWWt1t6iR+lW4EhsLpVPDiwSrkjFuM0o+EJw8A6HUa/yqSsZ9Sp1/AbqHZzusoQp0dnk mblo4FdyEm+DBuD6idTvw5Sk5RNpORE7W279jDcGOsWIPHPWUfXkA0e/e4XROe2rsN1y lRP6aanEBWtDpKDCNofBECoPf/PV2FU22fcNrfotQFVCGXltu9hloVf4T1++kWEL5qHU WeJXv+EKcbnm0HK45xMDBYCMxSvYHQ81FdtKzFDiHg4JIOGZYj2YShAwa2CGrL4IJ8HS trEA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1754987096; x=1755591896; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=Dz0nGUQYX/MddH++XMQj//Q3CBA0YgI/6B1YJorxgPM=; b=O0Nv+tPD88v74aMRQRgNxP6y2Kc1XXmQiwGx4CbcQGQ5KCQdlBXaoB3sAD97vS+gEt vpz5ZtkNBYFHI3S9PXpJdzowk/WZz4tlaD9aqIoKnYOefNK6tOOlIROZCKZ2Nzx3O1F8 8c3Nl6Z2LW8yW2tFdDEsxb/5iJBLCPpxzgzKiMP7TKndd4TUPnuISax2Mcg8F0kewWwq FhmkZc0d6iuUTEf7mvP+ywteOQsolFF28QpH3B8K1BQN6N5h9ne7ykjOghsRqwdwA+ge F1rPLGD2VrtJE39xndzPxFa5VY36SrWQ6TljlDu1jCV4e2taniNE6RAkXIDA2RFSs5S6 OLuA== X-Gm-Message-State: AOJu0YwrY782ulI0lLynjUAwhYipoc4RyXqHuQjuh3dJ34eXo+bt0ml+ wKmxxbhWfdgI+fX4hW7Omsp5RTiHumCiUxu6IPRnJ2RIGKKgodDjL4MOAU9YsnayYZCF6FtnUNW J3B/vOdAsYhzuRuW7rCh3lwkGex+pwi4= X-Gm-Gg: ASbGnctfjWjFNpV8vRPlGBFArHfLQ0JvISzhoWWSRzXN2t5HGTmxuRN74S06ha/nkWB Cm0F1+oF4WrkpkX/0fGV9gGedNpBJTOYMkQy5iiUa3BUopJCXzkwzEiiY7c07SAPVjl42iWia0N 9C0PiSEjg+orj4aXLER5vF9e6Khl/Wh/99x8YxN4mzOvMJ/O6qRQQgeogDFwH8ch4zEsSBmluxC Mdaeqz4tq5cKmfWU1bRG0+hGZGB7SctaEyvXo4S X-Google-Smtp-Source: AGHT+IGHfJUzjmjA7C3br7YJQ+Yd2pBjKKZKgtXX5Ep0VgEFVyGeFDhFG6ppV3b9qCmptM/Ex5O1B/dQskIrzyU88Ak= X-Received: by 2002:a05:6102:5816:b0:4e5:980a:d164 with SMTP id ada2fe7eead31-50cba41ef3dmr1227409137.0.1754987096041; Tue, 12 Aug 2025 01:24:56 -0700 (PDT) MIME-Version: 1.0 References: In-Reply-To: From: px shi Date: Tue, 12 Aug 2025 16:24:44 +0800 X-Gm-Features: Ac12FXy3WgnOiok9ry8W5otSPhrBBOMaK0BP-YpG3Qpvej_3GHh65rDP1LRJXOg Message-ID: Subject: Re: Questions about the continuity of WAL archiving To: Adrian Klaver Cc: pgsql-general@lists.postgresql.org Content-Type: multipart/alternative; boundary="00000000000071ecb1063c26c66e" List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk --00000000000071ecb1063c26c66e Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable > 1) What is the current archiving setup on the primary and why is lagging? The archive command uses pgBackRest to archive to S3. Because it is uploaded to S3, the archiving speed is slow, which has caused lagging. 2) Have you looked at archiving off the standby node while it is in standby > per: Yes, archiving on the standby node is disabled. Is it recommended to share the WAL archive between the primary and standby nodes to avoid interruptions in archiving? Adrian Klaver =E4=BA=8E2025=E5=B9=B48=E6=9C=888= =E6=97=A5=E5=91=A8=E4=BA=94 23:23=E5=86=99=E9=81=93=EF=BC=9A > On 8/7/25 22:50, px shi wrote: > > Thank you for your reply. > > The archived files can be used for PITR (Point-In-Time Recovery), > > allowing recovery to any point between WAL 80 and 100 on timeline 1. > > Additionally, if there's a backup taken during timeline 1 and a > > switchover to a new primary has occurred without taking a new full > > backup yet, these WAL logs can still be used to recover to any point on > > timeline 2. > > Alright I see. > > Two things: > > 1) What is the current archiving setup on the primary and why is lagging? > > 2) Have you looked at archiving off the standby node while it is in > standby per: > > > https://www.postgresql.org/docs/current/warm-standby.html#CONTINUOUS-ARCH= IVING-IN-STANDBY > > > > > Regards, > > Pixian Shi > > > > Adrian Klaver > > =E4=BA=8E2025=E5=B9=B48=E6=9C=888= =E6=97=A5=E5=91=A8=E4=BA=94 12:25=E5=86=99=E9=81=93=EF=BC=9A > > > > On 8/7/25 20:20, px shi wrote: > > > Hi, > > > There is a scenario: the current timeline of the PostgreSQL > > primary node > > > is 1, and the latest WAL file is 100. The standby node has also > > received > > > up to WAL file 100. However, the latest WAL file archived is onl= y > > file > > > 80. If the primary node crashes at this point and the standby is > > > promoted to the new primary, archiving will resume from file 100 > on > > > timeline 2. As a result, WAL files from 81 to 100 on timeline 1 > > will be > > > missing from the archive. > > > > What are you planning to do with the archived files? > > > > Also is not the case that once the primary crashes you are in a spl= it > > brain case and can't really trust it's timeline anymore? > > > > > > > Is there a good solution to prevent this situation? > > > > > > Regards, > > > Pixian Shi > > > > > > -- > > Adrian Klaver > > adrian.klaver@aklaver.com > > > > > -- > Adrian Klaver > adrian.klaver@aklaver.com > --00000000000071ecb1063c26c66e Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable

1) What is the current archiving setu= p on the primary and why is lagging?
=C2=A0The archive com= mand uses pgBackRest to archive to S3. Because it is uploaded to S3, the ar= chiving speed is slow, which has caused lagging.

2) Have you looked at archiving off the standby node while it is= in standby per:
Yes, archiving on the standby node is dis= abled. Is it recommended to share the WAL archive between the primary and s= tandby nodes to avoid interruptions in archiving?

Adrian Klaver <adrian= .klaver@aklaver.com> =E4=BA=8E2025=E5=B9=B48=E6=9C=888=E6=97=A5=E5= =91=A8=E4=BA=94 23:23=E5=86=99=E9=81=93=EF=BC=9A
= On 8/7/25 22:50, px shi wrote:
> Thank you for your reply.
> The archived files can be used for PITR (Point-In-Time Recovery),
> allowing recovery to any point between WAL 80 and 100 on timeline 1. > Additionally, if there's a backup taken during timeline 1 and a > switchover to a new primary has occurred without taking a new full > backup yet, these WAL logs can still be used to recover to any point o= n
> timeline 2.

Alright I see.

Two things:

1) What is the current archiving setup on the primary and why is lagging?
2) Have you looked at archiving off the standby node while it is in
standby per:

https://www= .postgresql.org/docs/current/warm-standby.html#CONTINUOUS-ARCHIVING-IN-STAN= DBY

>
> Regards,
> Pixian Shi
>
> Adrian Klaver <adrian.klaver@aklaver.com
> <mailto:adrian.klaver@aklaver.com>> =E4=BA=8E2025=E5=B9=B48=E6=9C=888= =E6=97=A5=E5=91=A8=E4=BA=94 12:25=E5=86=99=E9=81=93=EF=BC=9A
>
>=C2=A0 =C2=A0 =C2=A0On 8/7/25 20:20, px shi wrote:
>=C2=A0 =C2=A0 =C2=A0 > Hi,
>=C2=A0 =C2=A0 =C2=A0 > There is a scenario: the current timeline of = the PostgreSQL
>=C2=A0 =C2=A0 =C2=A0primary node
>=C2=A0 =C2=A0 =C2=A0 > is 1, and the latest WAL file is 100. The sta= ndby node has also
>=C2=A0 =C2=A0 =C2=A0received
>=C2=A0 =C2=A0 =C2=A0 > up to WAL file 100. However, the latest WAL f= ile archived is only
>=C2=A0 =C2=A0 =C2=A0file
>=C2=A0 =C2=A0 =C2=A0 > 80. If the primary node crashes at this point= and the standby is
>=C2=A0 =C2=A0 =C2=A0 > promoted to the new primary, archiving will r= esume from file 100 on
>=C2=A0 =C2=A0 =C2=A0 > timeline 2. As a result, WAL files from 81 to= 100 on timeline 1
>=C2=A0 =C2=A0 =C2=A0will be
>=C2=A0 =C2=A0 =C2=A0 > missing from the archive.
>
>=C2=A0 =C2=A0 =C2=A0What are you planning to do with the archived files= ?
>
>=C2=A0 =C2=A0 =C2=A0Also is not the case that once the primary crashes = you are in a split
>=C2=A0 =C2=A0 =C2=A0brain case and can't really trust it's time= line anymore?
>
>
>=C2=A0 =C2=A0 =C2=A0 > Is there a good solution to prevent this situ= ation?
>=C2=A0 =C2=A0 =C2=A0 >
>=C2=A0 =C2=A0 =C2=A0 > Regards,
>=C2=A0 =C2=A0 =C2=A0 > Pixian Shi
>
>
>=C2=A0 =C2=A0 =C2=A0--
>=C2=A0 =C2=A0 =C2=A0Adrian Klaver
>=C2=A0 =C2=A0 =C2=A0adrian.klaver@aklaver.com <mailto:adrian.klaver@aklaver.com><= br> >


--
Adrian Klaver
adrian.klave= r@aklaver.com
--00000000000071ecb1063c26c66e--