Re: Questions about the continuity of WAL archiving

public inbox for [email protected]  
12+ messages / 6 participants

* Re: Questions about the continuity of WAL archiving
@ 2025-08-12 08:24  px shi <[email protected]>
  0 siblings, 1 reply; 12+ messages in thread

From: px shi @ 2025-08-12 08:24 UTC (permalink / raw)
  To: Adrian Klaver <[email protected]>; +Cc: [email protected]

> 1) What is the current archiving setup on the primary and why is it lagging?

The archive command uses pgBackRest to archive to S3. Uploads to S3 are
slow, so archiving lags behind WAL generation.

> 2) Have you looked at archiving off the standby node while it is in standby
> per:

Yes, archiving on the standby node is disabled. Is it recommended to share
the WAL archive between the primary and standby nodes to avoid
interruptions in archiving?

On Fri, Aug 8, 2025 at 23:23, Adrian Klaver <[email protected]> wrote:

> On 8/7/25 22:50, px shi wrote:
> > Thank you for your reply.
> > The archived files can be used for PITR (Point-In-Time Recovery),
> > allowing recovery to any point between WAL 80 and 100 on timeline 1.
> > Additionally, if there's a backup taken during timeline 1 and a
> > switchover to a new primary has occurred without taking a new full
> > backup yet, these WAL logs can still be used to recover to any point on
> > timeline 2.
>
> Alright I see.
>
> Two things:
>
> 1) What is the current archiving setup on the primary and why is it lagging?
>
> 2) Have you looked at archiving off the standby node while it is in
> standby per:
>
>
> https://www.postgresql.org/docs/current/warm-standby.html#CONTINUOUS-ARCHIVING-IN-STANDBY
>
> >
> > Regards,
> > Pixian Shi
> >
> > On Fri, Aug 8, 2025 at 12:25, Adrian Klaver <[email protected]
> > <mailto:[email protected]>> wrote:
> >
> >     On 8/7/25 20:20, px shi wrote:
> >      > Hi,
> >      > There is a scenario: the current timeline of the PostgreSQL
> >      > primary node is 1, and the latest WAL file is 100. The standby
> >      > node has also received up to WAL file 100. However, the latest
> >      > WAL file archived is only file 80. If the primary node crashes
> >      > at this point and the standby is promoted to the new primary,
> >      > archiving will resume from file 100 on timeline 2. As a result,
> >      > WAL files from 81 to 100 on timeline 1 will be missing from the
> >      > archive.
> >
> >     What are you planning to do with the archived files?
> >
> >     Also, is it not the case that once the primary crashes you are in a
> >     split-brain case and can't really trust its timeline anymore?
> >
> >
> >      > Is there a good solution to prevent this situation?
> >      >
> >      > Regards,
> >      > Pixian Shi
> >
> >
> >     --
> >     Adrian Klaver
> >     [email protected] <mailto:[email protected]>
> >
>
>
> --
> Adrian Klaver
> [email protected]
>



* Re: Questions about the continuity of WAL archiving
@ 2025-08-12 16:14  Adrian Klaver <[email protected]>
  parent: px shi <[email protected]>
  0 siblings, 2 replies; 12+ messages in thread

From: Adrian Klaver @ 2025-08-12 16:14 UTC (permalink / raw)
  To: px shi <[email protected]>; +Cc: [email protected]

On 8/12/25 01:24, px shi wrote:
> 
>     1) What is the current archiving setup on the primary and why is
>     it lagging?
> 
>   The archive command uses pgBackRest to archive to S3. Because it is 
> uploaded to S3, the archiving speed is slow, which has caused lagging.
> 
>     2) Have you looked at archiving off the standby node while it is in
>     standby per:
> 
> Yes, archiving on the standby node is disabled. Is it recommended to 
> share the WAL archive between the primary and standby nodes to avoid 
> interruptions in archiving?

Given that you are using a less than capable storage solution (S3), why do
you think pushing the WAL from the standby to S3 would perform any
better than what is happening with the primary WAL?

The solution is to use a more capable storage platform.


-- 
Adrian Klaver
[email protected]







* Re: Questions about the continuity of WAL archiving
@ 2025-08-12 17:40  Bob Jolliffe <[email protected]>
  parent: Adrian Klaver <[email protected]>
  1 sibling, 1 reply; 12+ messages in thread

From: Bob Jolliffe @ 2025-08-12 17:40 UTC (permalink / raw)
  To: [email protected]

On Tue, 12 Aug 2025 at 17:14, Adrian Klaver <[email protected]>
wrote:

> On 8/12/25 01:24, px shi wrote:
> >
> >     1) What is the current archiving setup on the primary and why is
> >     it lagging?
> >
> >   The archive command uses pgBackRest to archive to S3. Because it is
> > uploaded to S3, the archiving speed is slow, which has caused lagging.
> >
> >     2) Have you looked at archiving off the standby node while it is in
> >     standby per:
> >
> > Yes, archiving on the standby node is disabled. Is it recommended to
> > share the WAL archive between the primary and standby nodes to avoid
> > interruptions in archiving?
>
> Given that you are using a less than capable storage solution (S3), why do
> you think pushing the WAL from the standby to S3 would perform any
> better than what is happening with the primary WAL?
>
> The solution is to use a more capable storage platform.
>

That is an interesting point you make Adrian.  S3 seems quite popular for
this type of archiving.  What would you suggest as a more capable (and cost
effective) storage platform?

Regards
Bob


* Re: Questions about the continuity of WAL archiving
@ 2025-08-12 19:19  Adrian Klaver <[email protected]>
  parent: Bob Jolliffe <[email protected]>
  0 siblings, 1 reply; 12+ messages in thread

From: Adrian Klaver @ 2025-08-12 19:19 UTC (permalink / raw)
  To: Bob Jolliffe <[email protected]>; [email protected]

On 8/12/25 10:40, Bob Jolliffe wrote:
> On Tue, 12 Aug 2025 at 17:14, Adrian Klaver <[email protected] 
> <mailto:[email protected]>> wrote:

>     The solution is to use a more capable storage platform.
> 
> 
> That is an interesting point you make Adrian.  S3 seems quite popular 
> for this type of archiving.  What would you suggest as a more capable 

Yes but from here:

https://pgbackrest.org/user-guide-rhel.html#s3-support

File creation time in S3 is relatively slow so backup/restore 
performance is improved by enabling file bundling.

Where file bundling is explained here:

https://pgbackrest.org/user-guide-rhel.html#backup/bundle

Though I don't think that would help in this case.

> (and cost effective) storage platform?

I would say anything that uses block storage rather than object storage,
so you are not doing the conversion. I have no specific recommendations,
as archiving to the cloud is not something I do.

> 
> Regards
> Bob
> 
-- 
Adrian Klaver
[email protected]


* Re: Questions about the continuity of WAL archiving
@ 2025-08-13 02:05  px shi <[email protected]>
  parent: Adrian Klaver <[email protected]>
  1 sibling, 1 reply; 12+ messages in thread

From: px shi @ 2025-08-13 02:05 UTC (permalink / raw)
  To: Adrian Klaver <[email protected]>; +Cc: [email protected]

Hi, Adrian

> Given that you are using a less than capable storage solution (S3), why do
> you think pushing the WAL from the standby to S3 would perform any
> better than what is happening with the primary WAL?
>

I mean that archive_mode is set to on on the primary and to always on the
standby.
This way, even if the primary crashes, the standby can still archive WAL
files that the primary did not archive.
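
One caveat when both nodes push to the same archive: for a shared archive with archive_mode = always, the PostgreSQL documentation asks that the archive command succeed when a file with identical contents already exists and fail when the contents differ. The sketch below illustrates only that rule for a plain directory archive. It is not what pgBackRest does internally, and the function name, the directory layout, and the temporary-file naming are assumptions made for the example.

```python
import os
import shutil
from pathlib import Path


def archive_segment(src: str, archive_dir: str) -> int:
    """Copy one WAL segment into archive_dir and return a shell-style exit code.

    0: the segment was archived now, or an identical copy was already there.
    1: a file with the same name but different contents exists; do not overwrite.
    """
    source = Path(src)
    target = Path(archive_dir) / source.name
    if target.exists():
        # The other node may have archived this segment first.
        same = source.read_bytes() == target.read_bytes()
        return 0 if same else 1
    # Write under a temporary name and rename afterwards, so a crash in the
    # middle of the copy never leaves a truncated file under the final name.
    # (A production command would also fsync the file and the directory.)
    tmp = target.with_name(f"{target.name}.{os.getpid()}.tmp")
    shutil.copyfile(source, tmp)
    os.replace(tmp, target)
    return 0
```

If something like this were used as an archive_command, a small wrapper would pass %p (the path of the segment to archive) as `src` and exit with the returned code; PostgreSQL keeps retrying a segment whose archive command exits non-zero.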

> The solution is to use a more capable storage platform.
>

 However, I believe that even if we use a more capable storage platform, it
is still impossible to archive WAL files in real time. As long as real-time
archiving cannot be achieved, there will always be some WAL files that are
not archived if the primary node crashes.


* Re: Questions about the continuity of WAL archiving
@ 2025-08-13 02:10  Ron Johnson <[email protected]>
  parent: px shi <[email protected]>
  0 siblings, 1 reply; 12+ messages in thread

From: Ron Johnson @ 2025-08-13 02:10 UTC (permalink / raw)
  To: pgsql-generallists.postgresql.org <[email protected]>

How often does your primary node crash, and then not recover due to WAL
corruption or missing WAL?

If it's _ever_ happened, you should _fix that_ instead of rolling your own
WAL archival process.


-- 
Death to <Redacted>, and butter sauce.
Don't boil me, I'm still alive.
<Redacted> lobster!



* Re: Questions about the continuity of WAL archiving
@ 2025-08-13 02:24  px shi <[email protected]>
  parent: Ron Johnson <[email protected]>
  0 siblings, 1 reply; 12+ messages in thread

From: px shi @ 2025-08-13 02:24 UTC (permalink / raw)
  To: Ron Johnson <[email protected]>; +Cc: pgsql-generallists.postgresql.org <[email protected]>

>
> How often does your primary node crash, and then not recover due to WAL
> corruption or missing WAL?
>
> If it's _ever_ happened, you should _fix that_ instead of rolling your own
> WAL archival process.
>

 I once encountered a case where the recovery process failed to restore to
the latest LSN due to missing WAL files in the archive. The root cause was
multiple failovers between primary and standby. During one of the
switchovers, the primary crashed before completing the archiving of all WAL
files. When the standby was promoted to primary, it began archiving WAL
files for the new timeline, resulting in a gap between the WAL files of the
two timelines. Moreover, no base backup was taken during this period.



* Re: Questions about the continuity of WAL archiving
@ 2025-08-13 02:50  Justin <[email protected]>
  parent: px shi <[email protected]>
  0 siblings, 1 reply; 12+ messages in thread

From: Justin @ 2025-08-13 02:50 UTC (permalink / raw)
  To: px shi <[email protected]>; +Cc: Ron Johnson <[email protected]>; pgsql-generallists.postgresql.org <[email protected]>

On Tue, Aug 12, 2025 at 10:24 PM px shi <[email protected]> wrote:

>> How often does your primary node crash, and then not recover due to WAL
>> corruption or missing WAL?
>>
>> If it's _ever_ happened, you should _fix that_ instead of rolling your
>> own WAL archival process.
>>
>
>  I once encountered a case where the recovery process failed to restore to
> the latest LSN due to missing WAL files in the archive. The root cause was
> multiple failovers between primary and standby. During one of the
> switchovers, the primary crashed before completing the archiving of all WAL
> files. When the standby was promoted to primary, it began archiving WAL
> files for the new timeline, resulting in a gap between the WAL files of the
> two timelines. Moreover, no base backup was taken during this period.
>
>

I am not sure what the problem is here either, other than something
seriously wrong with the configuration of PostgreSQL and pgBackRest.

The replica should be receiving the WAL via a replication slot using
streaming, meaning the primary will keep the WAL until the replica is
caught up. If the replica becomes disconnected because
max_slot_wal_keep_size (or, without a slot, wal_keep_size, formerly
wal_keep_segments) is exceeded, the replica's restore_command can take over
and fetch from the WAL archive to catch the replica up. This assumes
hot_standby_feedback is on, so WAL replay won't be delayed by snapshot
conflicts on the replica.

If all the above is true, the replica should never lag behind unless its
disk I/O layer is way undersized compared to the primary's. S3 is being
talked about, so it makes me wonder about the disk I/O configuration on the
primary vs the replica. I see this causing lag under high load where the
replica I/O layer is the bottleneck.

If pgBackRest can't keep up with WAL archiving then, as others have stated,
you need to configure asynchronous archiving. The number of workers depends
on the load. I have a server running 8 parallel workers to archive 1TB of
WAL daily, and another server that generates around 10,000 WAL files in
about 2 hours during maintenance tasks, using 6 pgBackRest workers. All of
it goes to S3 buckets.
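
As rough arithmetic on those figures, the sketch below converts them to WAL files per minute. It assumes the default 16 MiB segment size and reads "1TB" as 1 TiB; neither is stated in the message, so treat the results as order-of-magnitude only.

```python
SEGMENT_BYTES = 16 * 1024**2  # assumption: default 16 MiB WAL segment size


def segments_per_minute(total_bytes: float, minutes: float) -> float:
    """Average number of WAL segments produced per minute."""
    return total_bytes / SEGMENT_BYTES / minutes


# 1 TiB of WAL per day, archived by 8 workers.
daily_rate = segments_per_minute(1024**4, 24 * 60)
# 10,000 WAL files in about 2 hours, archived by 6 workers.
burst_rate = 10_000 / (2 * 60)

print(f"steady: {daily_rate:.1f} files/min, {daily_rate / 8:.1f} per worker")
print(f"burst:  {burst_rate:.1f} files/min, {burst_rate / 6:.1f} per worker")
# steady: 45.5 files/min, 5.7 per worker
# burst:  83.3 files/min, 13.9 per worker
```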

The above statement makes me wonder if there is some kind of
high-availability monitor running, like pg_auto_failover, that is promoting
a replica and then converting the former primary to a replica of the
recently promoted replica.

If the above matches what is happening, it is very easy to mess up the
configuration for WAL archiving and backups. Part of the process of
promoting a replica is to make sure WAL archiving is working. The replica,
after being promoted, immediately kicks off autovacuum to rebuild things
like the FSM, which generates a lot of WAL files.

If you are losing WAL files, the configuration is wrong somewhere.

There is just not enough information on the series of events and the
configuration to tell what the root cause is, other than misconfiguration.

Thanks
Justin


* Re: Questions about the continuity of WAL archiving
@ 2025-08-13 05:48  px shi <[email protected]>
  parent: Justin <[email protected]>
  0 siblings, 2 replies; 12+ messages in thread

From: px shi @ 2025-08-13 05:48 UTC (permalink / raw)
  To: Justin <[email protected]>; +Cc: Ron Johnson <[email protected]>; pgsql-generallists.postgresql.org <[email protected]>

Here’s a scenario: The latest WAL file on the primary node is
0000000100000000000000AF, and the standby node has also received up to
0000000100000000000000AF. However, the latest WAL file that has been
successfully archived from the primary is only 0000000100000000000000A1
(WAL files from A2 to AE have not yet been archived). If the primary
crashes at this point, triggering a failover, the new primary will start
generating and archiving WAL on a new timeline (2), beginning with
0000000200000000000000AF. It will not backfill the missing WAL files from
timeline 1 (0000000100000000000000A2 to 0000000100000000000000AE). As a
result, while the new primary does not have any local WAL gaps, the archive
directory will contain a gap in that WAL range.
I’m not sure if I explained it clearly.
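
The file names in this scenario can be decoded mechanically: a WAL segment name is 24 hex digits, 8 for the timeline, 8 for the high part of the segment number and 8 for the low part. The sketch below lists the completed segments that sit between the last archived file and the file in progress. It assumes the default 16 MiB segment size (256 segments per high-part increment), and the helper names are made up for the example.

```python
SEGMENTS_PER_XLOGID = 0x100000000 // (16 * 1024**2)  # 256 with 16 MiB segments


def parse_wal_name(name: str) -> tuple[int, int]:
    """Split a 24-hex-digit WAL file name into (timeline, segment number)."""
    if len(name) != 24:
        raise ValueError(f"not a WAL segment name: {name!r}")
    timeline = int(name[0:8], 16)
    high = int(name[8:16], 16)
    low = int(name[16:24], 16)
    return timeline, high * SEGMENTS_PER_XLOGID + low


def format_wal_name(timeline: int, segno: int) -> str:
    """Inverse of parse_wal_name."""
    high, low = divmod(segno, SEGMENTS_PER_XLOGID)
    return f"{timeline:08X}{high:08X}{low:08X}"


def unarchived_segments(last_archived: str, current: str) -> list[str]:
    """Completed segments after last_archived and before the current one."""
    tli_a, seg_a = parse_wal_name(last_archived)
    tli_c, seg_c = parse_wal_name(current)
    if tli_a != tli_c:
        raise ValueError("both names must be on the same timeline")
    return [format_wal_name(tli_a, s) for s in range(seg_a + 1, seg_c)]


gap = unarchived_segments("0000000100000000000000A1", "0000000100000000000000AF")
print(gap[0], gap[-1], len(gap))
# 0000000100000000000000A2 0000000100000000000000AE 13
```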



* Re: Questions about the continuity of WAL archiving
@ 2025-08-13 14:05  Justin <[email protected]>
  parent: px shi <[email protected]>
  1 sibling, 0 replies; 12+ messages in thread

From: Justin @ 2025-08-13 14:05 UTC (permalink / raw)
  To: px shi <[email protected]>; +Cc: Ron Johnson <[email protected]>; pgsql-generallists.postgresql.org <[email protected]>

On Wed, Aug 13, 2025 at 1:48 AM px shi <[email protected]> wrote:

> Here’s a scenario: The latest WAL file on the primary node is
> 0000000100000000000000AF, and the standby node has also received up to
> 0000000100000000000000AF. However, the latest WAL file that has been
> successfully archived from the primary is only 0000000100000000000000A1
> (WAL files from A2 to AE have not yet been archived). If the primary
> crashes at this point, triggering a failover, the new primary will start
> generating and archiving WAL on a new timeline (2), beginning with
> 0000000200000000000000AF. It will not backfill the missing WAL files from
> timeline 1 (0000000100000000000000A2 to 0000000100000000000000AE). As a
> result, while the new primary does not have any local WAL gaps, the archive
> directory will contain a gap in that WAL range.
> I’m not sure if I explained it clearly.
>
>

This will happen if the replica is lagging and is promoted before it has
had a chance to catch up. This is working as designed. There are several
tools available to tell us whether the replica is in sync before promoting.
In the above case a lagging replica was promoted, so it stops looking at
the previous timeline and will NOT look for the missing WAL files from the
previous timeline. The replica does not even know they exist anymore.
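
One simple version of such a check compares LSNs. An LSN is printed as two hex numbers separated by a slash, the high and low 32 bits of a 64-bit byte position, so the lag in bytes is a plain subtraction. The sketch below assumes the two values were read from pg_current_wal_lsn() on the primary and pg_last_wal_replay_lsn() on the standby; the sample LSNs are invented, and the server-side function pg_wal_lsn_diff() does the same subtraction.

```python
def lsn_to_int(lsn: str) -> int:
    """Convert an LSN such as '0/AF000000' to a 64-bit byte position."""
    high, low = lsn.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def replay_lag_bytes(primary_lsn: str, standby_replay_lsn: str) -> int:
    """Bytes of WAL the standby has not replayed yet (0 means caught up)."""
    return lsn_to_int(primary_lsn) - lsn_to_int(standby_replay_lsn)


# Hypothetical sample values: the standby is exactly one 16 MiB segment behind.
print(replay_lag_bytes("0/AF000000", "0/AE000000"))
# 16777216
```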

The data in the previous timeline is no longer accessible from the
promoted replica; it is working on a new timeline. The only place the old
timeline's missed WAL files are accessible is on the crashed primary, which
never archived them or streamed them to the replica.

Promoting an out-of-sync/lagging replica will result in loss of data.

Does this answer the question here?



* Re: Questions about the continuity of WAL archiving
@ 2025-08-13 15:09  Adrian Klaver <[email protected]>
  parent: px shi <[email protected]>
  1 sibling, 0 replies; 12+ messages in thread

From: Adrian Klaver @ 2025-08-13 15:09 UTC (permalink / raw)
  To: px shi <[email protected]>; Justin <[email protected]>; +Cc: Ron Johnson <[email protected]>; pgsql-generallists.postgresql.org <[email protected]>

On 8/12/25 22:48, px shi wrote:
> Here’s a scenario: The latest WAL file on the primary node is 
> 0000000100000000000000AF, and the standby node has also received up to 
> 0000000100000000000000AF. However, the latest WAL file that has been 
> successfully archived from the primary is only 0000000100000000000000A1 
> (WAL files from A2 to AE have not yet been archived). If the primary 
> crashes at this point, triggering a failover, the new primary will start 
> generating and archiving WAL on a new timeline (2), beginning with 
> 0000000200000000000000AF. It will not backfill the missing WAL files 
> from timeline 1 (0000000100000000000000A2 to 0000000100000000000000AE). 
> As a result, while the new primary does not have any local WAL gaps, the 
> archive directory will contain a gap in that WAL range.
> I’m not sure if I explained it clearly.

Why does it matter?

1) Your standby is starting off up to date.

2) You can do a pg_basebackup from the new primary as a base for the 
restart of the old primary. Assuming you have archiving set up on the 
new primary then the restarted primary can catch up.

3) If you don't want to do 2) then you need an archive location that can 
deal with the velocity of the WAL archiving.

-- 
Adrian Klaver
[email protected]







* Re: Questions about the continuity of WAL archiving
@ 2025-08-15 16:40  Greg Sabino Mullane <[email protected]>
  parent: Adrian Klaver <[email protected]>
  0 siblings, 0 replies; 12+ messages in thread

From: Greg Sabino Mullane @ 2025-08-15 16:40 UTC (permalink / raw)
  To: Adrian Klaver <[email protected]>; +Cc: Bob Jolliffe <[email protected]>; [email protected]

On Tue, Aug 12, 2025 at 3:20 PM Adrian Klaver <[email protected]>
wrote:

> File creation time in S3 is relatively slow so backup/restore
> performance is improved by enabling file bundling.
>

Just to be clear for the archives, pgBackRest's file bundling only applies
to backups, not to WAL, which is what the OP is dealing with here.

Pixian Shi, what sort of WAL volume are we dealing with? (in WALs generated
per minute)
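
One way to get that number is to sample pg_current_wal_lsn() twice on the primary and divide the difference by the segment size. A minimal sketch, assuming the default 16 MiB segment size; the two LSNs and the 60-second interval below are invented sample values.

```python
def lsn_to_int(lsn: str) -> int:
    """Convert an LSN such as '0/A1000000' to a 64-bit byte position."""
    high, low = lsn.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def wal_segments_per_minute(lsn_start: str, lsn_end: str,
                            elapsed_seconds: float,
                            segment_bytes: int = 16 * 1024**2) -> float:
    """WAL generation rate between two LSN samples, in segments per minute."""
    written = lsn_to_int(lsn_end) - lsn_to_int(lsn_start)
    return written / segment_bytes * 60 / elapsed_seconds


# Hypothetical samples taken 60 seconds apart.
print(wal_segments_per_minute("0/A1000000", "0/AF000000", 60.0))
# 14.0
```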

Cheers,
Greg

--
Crunchy Data - https://www.crunchydata.com
Enterprise Postgres Software Products & Tech Support


Thread overview: 12+ messages
2025-08-12 08:24 Re: Questions about the continuity of WAL archiving px shi <[email protected]>
2025-08-12 16:14 ` Adrian Klaver <[email protected]>
2025-08-12 17:40   ` Bob Jolliffe <[email protected]>
2025-08-12 19:19     ` Adrian Klaver <[email protected]>
2025-08-15 16:40       ` Greg Sabino Mullane <[email protected]>
2025-08-13 02:05   ` px shi <[email protected]>
2025-08-13 02:10     ` Ron Johnson <[email protected]>
2025-08-13 02:24       ` px shi <[email protected]>
2025-08-13 02:50         ` Justin <[email protected]>
2025-08-13 05:48           ` px shi <[email protected]>
2025-08-13 14:05             ` Justin <[email protected]>
2025-08-13 15:09             ` Adrian Klaver <[email protected]>
