From: OMPRAKASH SAHU <[email protected]>
To: Shubhang Joshi <[email protected]>
Cc: Laurenz Albe <[email protected]>
Cc: [email protected]
Subject: Re: WAL replay is too slow on secondary server
Date: Fri, 31 Oct 2025 13:17:48 +0530
Message-ID: <CAOZWJqNR3dxnwn+HGPszQB8BY67_E=eoa7SzArL=t=PMOtUAMQ@mail.gmail.com>
In-Reply-To: <CAOJCrX-3S-afnX=DqTwb=+SS8-_0Gexqs_D+z12jNbg8xZ5ccw@mail.gmail.com>
References: <CAOZWJqPc+s_vA-UfWWLR0s6Mt+DCffjXXVyLHJNJiuMrDLTYcA@mail.gmail.com>
<CAOJCrX91Xf3HU5J0Vn_FdrRDpMevNiZUEN3oAWwk4J1H0ibo-Q@mail.gmail.com>
<[email protected]>
<CAOJCrX-3S-afnX=DqTwb=+SS8-_0Gexqs_D+z12jNbg8xZ5ccw@mail.gmail.com>
Hi Everyone,
Thank you for the suggestions.
I have changed a few settings on the DB side, on the secondary only. As of
yesterday it seems fine; I will keep monitoring it.
Below are the changes:
wal_decode_buffer_size
maintenance_io_concurrency
bgwriter_delay
I also checked with AWS support whether micro-bursting was happening, but
according to them the allocation is sufficient.
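For the ongoing monitoring, one way to quantify how far replay is behind on
the standby is to compare the received and replayed WAL positions. The
following is only a minimal sketch in Python, not something taken from this
setup: the helper names lsn_to_int and replay_gap_bytes are hypothetical, and
it assumes the two LSN strings were read on the standby with
SELECT pg_last_wal_receive_lsn(), pg_last_wal_replay_lsn();

```python
def lsn_to_int(lsn: str) -> int:
    """Convert a PostgreSQL LSN such as '16/B374D848' to a byte position.

    An LSN is printed as two hexadecimal numbers separated by a slash:
    the high 32 bits and the low 32 bits of a 64-bit WAL position.
    """
    high, low = lsn.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def replay_gap_bytes(receive_lsn: str, replay_lsn: str) -> int:
    """Bytes of WAL received by the standby but not yet replayed.

    Both names are hypothetical helpers for illustration only.
    """
    return lsn_to_int(receive_lsn) - lsn_to_int(replay_lsn)


if __name__ == "__main__":
    # Made-up LSN values, purely to show the call; not measurements.
    gap = replay_gap_bytes("1/10", "0/FFFFFFF0")
    print(f"replay is {gap} bytes behind receive")
```

If this gap keeps growing while the receive position advances normally, that
would be consistent with WAL arriving on time and replay being the slow part.
The same difference can also be computed in SQL with pg_wal_lsn_diff().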
Regards,
OM
On Fri, 31 Oct 2025, 09:54 Shubhang Joshi, <[email protected]>
wrote:
> Hi OM,
> Hi Laurenz,
>
> Thank you for your insights.
>
> I apologize for my previous suggestion regarding network speed; upon
> further review, it was not the correct cause in this scenario.
>
> Based on the current observations and system metrics, the accumulation of
> WAL on the standby server points to disk I/O limitations during replay—not
> network speed. CPU and RAM usage remain low, and WAL traffic is reaching
> the replica without delay, but replay/apply on disk is slow.
>
> The root cause appears to be disk subsystem performance and the
> single-threaded nature of WAL replay in PostgreSQL recovery. Optimizing
> disk throughput or reconfiguring memory may help, but network latency does
> not seem to be affecting this scenario.
>
> Regards,
> Shubhang
>
> On Thu, 30 Oct 2025 at 17:45, Laurenz Albe <[email protected]>
> wrote:
>
>> On Thu, 2025-10-30 at 17:08 +0530, Shubhang Joshi wrote:
>> > On Thu, 30 Oct, 2025, 10:07 am OMPRAKASH SAHU, <[email protected]> wrote:
>> > > We have a postgresql cluster setup using patroni.
>> > > The DB is being used for heavy transactional application, now the
>> > > problem is that on replica server WAL replay is too slow.
>> > > We have increased the IOPS to 6k and Throughput to 600 on nvme EBS
>> > > volume of wal directory and 10k &800 on data directory.
>> > >
>> > > but the WAL is being accumulated on the replica as usual and applying
>> > > wal is having no improvement.
>> >
>> > Please check the network speed — we faced a similar issue earlier, and
>> > it turned out to be related to network performance.
>> > Kindly verify the network latency with your network team as well.
>>
>> If WAL is piling up on the standby, how can network speed be the problem?
>>
>> Yours,
>> Laurenz Albe
>>
>