Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1vy2at-00HN2J-11 for pgsql-general@arkaria.postgresql.org; Thu, 05 Mar 2026 06:53:59 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.96) (envelope-from ) id 1vy2aq-00GOiy-31 for pgsql-general@arkaria.postgresql.org; Thu, 05 Mar 2026 06:53:57 +0000 Received: from magus.postgresql.org ([2a02:c0:301:0:ffff::29]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1vy2aq-00GOiq-1M for pgsql-general@lists.postgresql.org; Thu, 05 Mar 2026 06:53:57 +0000 Received: from mail-yx1-xb129.google.com ([2607:f8b0:4864:20::b129]) by magus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.98.2) (envelope-from ) id 1vy2ao-00000000p0X-2SeH for pgsql-general@postgresql.org; Thu, 05 Mar 2026 06:53:56 +0000 Received: by mail-yx1-xb129.google.com with SMTP id 956f58d0204a3-64ca6595c8aso7445294d50.0 for ; Wed, 04 Mar 2026 22:53:54 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1772693632; cv=none; d=google.com; s=arc-20240605; b=Orr3TiFsgNDCCy0q79oxD0C2/n0IWarW1kfTUXT7AvaNhpDSZO//KNgma2SbxQpYbq S5wCt65fLfcjY3QTZjD7kwM/IbLUFv4GL8PaR7WEBrIhhf0SuK60n/FZAFSv8MJFE3LO qfJiLgksp3DqT64wTYDt5B5b9BY7zcO5l3+0su5HOZmbqRCYTXcbR0ImuiXRq3L8myA4 5Dk9irpFF1XeKmzPVPFMB7ppu3k2C0Aq3DtqqzreKrjjy0sH7J4C2+KM/PkBzS5kpmpI 8ZOZS/h6hFRS7gTbuR0PHr2kK5g/gWRSpx0bQlYcRgciTxmNxGVzbS4z1aGpOkOflBqd 8bsw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20240605; h=to:subject:message-id:date:from:mime-version:dkim-signature; bh=o6sODBreoYWAPe6Qwvm91dHkLycWBjk+6mKhlTN5o6U=; fh=KNSq+t9BltSXFnT3Yof/aKGBtqxeA+bTALiYdvTslaY=; b=lf35wIBT6Yh2rMoYFCWbRPt2ViFcDpgFVW5TpzdCj0v4ShwD6i7ljhho7dMExg450s vejLvVLPXfRlif9zKvEENVoO0MnjljssrtFYlwPUfbvvTT+UiXOWtPCB8rn/mJeHEnV+ lHzIZyimVNWHWWZNc1vJj5jpRyZBy8i3RxlH0EFxNjfAUJYPBf9R90ZH+csq8o0prYjm qXtxi1KZsrQoJfHsaBmsy0B484McFxv6tMij+3K1V/QWH98RYOeB4hukmswzsKtoYFy7 xL5hmWaiUwEHa1LwzdJEYqOjxasOOYXz7PCrqaVPm0YHgnKJoOSfVyao/0nukm4Tl5jo cxbA==; darn=postgresql.org ARC-Authentication-Results: i=1; mx.google.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1772693632; x=1773298432; darn=postgresql.org; h=to:subject:message-id:date:from:mime-version:from:to:cc:subject :date:message-id:reply-to; bh=o6sODBreoYWAPe6Qwvm91dHkLycWBjk+6mKhlTN5o6U=; b=hnNsnDqG5jqzqK40c4znPRYOaKyPwr+5qKaGmDKSyw5jtHytJeewT53WzfLEyL62uy 5iK3v8T/cUaZiZTUJCMc0CZn9T2zAJb+ZVLILGCQYGiulsnO79phI+KOduyEjDMzTBAi t/fe+zBulcylE/uTHgGjJ1/4MMubEUicm9X3RLCHTanme/tjxuwfzXiF3BUxL9tA+Neo A3+n9j9gparQNv8D0kFvYMsy6MxwAF9EOrcNdZDv/laOOY/CTar7sL0EiIgxvkOVUvpb N1pze5jCbzjh4JNZyc12jjQ2PxKC9mAtEoN9XJK6Iy3dlG1vI9m3C7Faxpv59QchFLk6 csfw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1772693632; x=1773298432; h=to:subject:message-id:date:from:mime-version:x-gm-gg :x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=o6sODBreoYWAPe6Qwvm91dHkLycWBjk+6mKhlTN5o6U=; b=SH5YqM1yaFC/db17njJmQUn67D7oN4RonWKCuQYf3sfO3oHgalDwtewOFPVW9LEhP2 952JNMfaar4/oN5ebl0tN5gvRYK/twquKIBhu2x6Gy1lr6PdmYIBHjaJZ3Vi/AaJ+qxU jxk8X1Q7Jdi77uMFeHzRSSrUoyR6L26FepZaFixhPBozevjEAdGGRvoP9qKN98YXcLH1 WL2fgPB1QOvchZ1PMSc1KxY2a/qx8+yLLZ7S8VaaMDGGyyiVFz5WOI9wR0B5VFkWFvuJ rVcnqcycCxJw8NGvBxy1JzRbuKFWvgozs9F7zv2EgzY23V19hVCnvk6WrnaHEsvN3OeP WHog== X-Gm-Message-State: AOJu0Yz/D1be9ItN4YSRhUK0wXb7A4igSLfapGrJ0AMjjIS98wjUauw8 1CSiWGHopZm4zkz0hgUHydWr4V+ju5sgW7ajpb1BhDs7fhIuw/bNAaoEqIV+ONU0tDOOefKZ6co 1yrLEwIk0jGi06fcqWaPzIztmRWmqD9I/YpFW4wQ= X-Gm-Gg: ATEYQzw87hUhnmDPUkTj74cfYHPZu2vA4m52F6z5vjNRAZVDuQOn3Ne3d+Rqdpvn/vo Al5BwLM98ZzHJn23VlvxhRJ6IuVDoxMLJYjbpQmj+h8Cu0O2LHDD+zWBmSx4w0EHwlaqi40d2/q Am5POv7dBgSR89PtIAILDJNU5z6DkPxu4jUzCwZQsLx3iLcaPPA/5RwQeFOYuldValhi+qzBN77 y7d+eA9w/IoKkIcXzm2X4ogJMDuZQIRHYLFzwmSjXPJb4oPyuglluuzrgR66OepWax5RONs7rpe gaDmBU6Y X-Received: by 2002:a05:690e:1501:b0:64c:9ec3:d71f with SMTP id 956f58d0204a3-64cf9bab3famr3439014d50.50.1772693632397; Wed, 04 Mar 2026 22:53:52 -0800 (PST) MIME-Version: 1.0 From: KK CHN Date: Thu, 5 Mar 2026 12:30:17 +0530 X-Gm-Features: AaiRm52yw6GgPfI50iScKSWZI0gYywUwGPOrm6FTYqX7I4SX8h6QueFitXuhVVI Message-ID: Subject: Heavy load on DB Cluster To: pgsql-general Content-Type: multipart/alternative; boundary="0000000000004134b3064c416613" List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk --0000000000004134b3064c416613 Content-Type: text/plain; charset="UTF-8" List, I am experiencing heavy load on my database cluster and DB server performance degrading over the time. vCPUs 16 , Mem 32 G Swap : 8G storage 5T RHEL 9.4 postgres 16 top - 11:55:18 up 175 days, 7:52, 3 users, load average: 11.07, 10.05, 9.56 Tasks: 731 total, 14 running, 717 sleeping, 0 stopped, 0 zombie %Cpu(s): 28.8 us, 9.3 sy, 0.0 ni, 44.9 id, 13.7 wa, 0.8 hi, 2.5 si, 0.0 st MiB Mem : 31837.6 total, 531.8 free, 14773.3 used, 25392.0 buff/cache MiB Swap: 8060.0 total, 5140.4 free, 2919.6 used. 17064.2 avail Mem PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 3148533 postgre+ 20 0 8973544 7.3g 7.3g S 32.9 23.5 0:22.52 postgres 3150012 postgre+ 20 0 8991380 7.4g 7.4g S 32.9 23.9 0:26.16 postgres 3081907 postgre+ 20 0 9078400 8.0g 7.9g R 21.9 25.8 2:44.53 postgres 3125409 postgre+ 20 0 9075568 8.1g 8.0g S 21.3 26.0 2:34.63 postgres 3126500 postgre+ 20 0 9073928 8.0g 7.9g S 18.3 25.7 2:33.10 postgres 3081925 postgre+ 20 0 9059088 8.2g 8.2g S 17.6 26.5 6:38.79 postgres I have pgbackrest(2.52.1) running for incremental backups to a remote reposerver and local WAL replication configured to an onprem standalone instance on another VM in the same local LAN. archive_mode = on # enables archiving; off, on, or always # (change requires restart) # (empty string indicates archive_command should # be used) archive_command = 'pgbackrest --stanza=My_Repo archive-push %p' *To identify the resource consuming queries I ran and found only one [40 days 17:22:59.029204 | START_REPLICATION 8E ] *and rest all seems normal . How come this START_REPLICATION running for 40 days and 17:30 Hrs Is this normal ?? * Is this due to pgbackrest or WAL replication *to local VM ? What may be the issue and how to resolve it ? Any hints much appreciated.. Please see the below pasted outputs for more information. Thank you, Krishane Any more tests I need to perform let me know, I can produce those information also. postgres=# SELECT pid, now() - query_start AS duration, query, state FROM pg_stat_activity WHERE state != 'idle' ORDER BY duration DESC LIMIT 10; pid | duration | query | state ---------+-------------------------+-------------------------------------------------------------------------------------------------------------------------------------------- ------------------+--------------------- 2653841 | 40 days 17:22:59.029204 | START_REPLICATION 8EF/40000000 TIMELINE 1 | active 3119645 | 00:00:00.454917 | select easi1_0.signal_id,easi1_0.action_list,easi1_0.additional_info,easi1_0.address,easi1_0.alloc_voip_extn,easi1_0.app_version,easi1_0.caller_name,easi1_0.caller_no,easi1 _0.close_remarks,easi1_0.count,easi1_0.device_info,easi1_0.district_code,easi1_0.emergency_signal,easi1_0.event_type,easi1_0.gender,easi1_0.gps_accuracy,easi1_0.imei_no,easi1_0.informed_officers,easi1_0.invoke _id,easi1_0.is_shout,easi1_0.last_update_time,easi1_0.latitude,easi1_0.longitude,easi1_0.place,easi1_0.ps_code,easi1_0.receive_time,easi1_0.es_signal_id,easi1_0.rescuer_count,easi1_0.service,easi1_0.signal_s tatus,easi1_0.signal_type,easi1_0.silent_communication,easi1_0.source_type,easi1_0.state_code,easi1_0.is_valid_gps from es_app.es_app_signal_info easi1_0 where easi1_0.caller_no=$1 and easi1_0.imei_no=$2 a nd easi1_0.source_type=$3 order by easi1_0.last_update_time desc fetch first $4 rows only | active ............................. ............................................. --0000000000004134b3064c416613 Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable
List,

I am experiencing heav= y load on my database cluster and DB server performance degrading over the = time.
vCPUs 16 ,=C2=A0 Mem 32 G=C2=A0 =C2=A0Swap : 8G=C2=A0 =C2= =A0storage 5T=C2=A0 =C2=A0 =C2=A0 =C2=A0RHEL 9.4=C2=A0 postgres 16


top - 11:55:18 up 175 days, =C2=A07:52, =C2=A03 = users, =C2=A0load average: 11.07, 10.05, 9.56
Tasks: 731 total, =C2=A014= running, 717 sleeping, =C2=A0 0 stopped, =C2=A0 0 zombie
%Cpu(s): 28.8 = us, =C2=A09.3 sy, =C2=A00.0 ni, 44.9 id, 13.7 wa, =C2=A00.8 hi, =C2=A02.5 s= i, =C2=A00.0 st
MiB Mem : =C2=A031837.6 total, =C2=A0 =C2=A0531.8 free, = =C2=A014773.3 used, =C2=A025392.0 buff/cache
MiB Swap: =C2=A0 8060.0 tot= al, =C2=A0 5140.4 free, =C2=A0 2919.6 used. =C2=A017064.2 avail Mem

= =C2=A0 =C2=A0 PID USER =C2=A0 =C2=A0 =C2=A0PR =C2=A0NI =C2=A0 =C2=A0VIRT = =C2=A0 =C2=A0RES =C2=A0 =C2=A0SHR S =C2=A0%CPU =C2=A0%MEM =C2=A0 =C2=A0 TIM= E+ COMMAND
3148533 postgre+ =C2=A020 =C2=A0 0 8973544 =C2=A0 7.3g =C2=A0= 7.3g S =C2=A032.9 =C2=A023.5 =C2=A0 0:22.52 postgres
3150012 postgre+ = =C2=A020 =C2=A0 0 8991380 =C2=A0 7.4g =C2=A0 7.4g S =C2=A032.9 =C2=A023.9 = =C2=A0 0:26.16 postgres
3081907 postgre+ =C2=A020 =C2=A0 0 9078400 =C2= =A0 8.0g =C2=A0 7.9g R =C2=A021.9 =C2=A025.8 =C2=A0 2:44.53 postgres
312= 5409 postgre+ =C2=A020 =C2=A0 0 9075568 =C2=A0 8.1g =C2=A0 8.0g S =C2=A021.= 3 =C2=A026.0 =C2=A0 2:34.63 postgres
3126500 postgre+ =C2=A020 =C2=A0 0 = 9073928 =C2=A0 8.0g =C2=A0 7.9g S =C2=A018.3 =C2=A025.7 =C2=A0 2:33.10 post= gres
3081925 postgre+ =C2=A020 =C2=A0 0 9059088 =C2=A0 8.2g =C2=A0 = 8.2g S =C2=A017.6 =C2=A026.5 =C2=A0 6:38.79 postgres=C2=A0


I have pgbackrest(2.52.1) running for incremental b= ackups to a remote reposerver=C2=A0and local=C2=A0 =C2=A0WAL replication co= nfigured to an onprem standalone=C2=A0 instance on another VM in the same l= ocal LAN.=C2=A0


archive_mode =3D on= =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 # enables archiving; off,= on, or always
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 # (change re= quires restart)
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 # (empty string ind= icates archive_command should
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 # be = used)
archive_command =3D 'pgbackrest --stanza=3DMy_Repo archive-pus= h %p'


To identify the resour= ce consuming queries I ran=C2=A0 and found=C2=A0 only one=C2=A0 [40 days 17= :22:59.029204 | START_REPLICATION 8E ]=C2=A0 and rest all seems normal = .=C2=A0 =C2=A0

=C2=A0How come this START_REPLICATI= ON running for 40 days and 17:30 Hrs Is this normal ??=C2=A0 =C2=A0 Is this due to pgbackrest=C2=A0 =C2=A0 or=C2=A0 =C2=A0WAL repl= ication to local VM ?

What may be the i= ssue and how to resolve it ?


Any hi= nts much appreciated..=C2=A0 Please see the below pasted outputs for more i= nformation.

Thank you,
Krishane



Any more tests I need to per= form let me know, I can produce those information also.


postgres=3D# SELECT pid, now() - query_start AS duration, query, state= FROM pg_stat_activity WHERE state !=3D 'idle' ORDER BY duration DE= SC LIMIT 10;

=C2=A0 =C2=A0pid =C2=A0 | =C2=A0 =C2=A0 =C2=A0 =C2=A0d= uration =C2=A0 =C2=A0 =C2=A0 =C2=A0 |

=C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0query

=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 | =C2=A0 = =C2=A0 =C2=A0 =C2=A0state
---------+-------------------------+----------= ---------------------------------------------------------------------------= -------------------------------------------------------
----------------= --+---------------------
=C2=A02653841 | 40 days= 17:22:59.029204 | START_REPLICATION 8EF/40000000 TIMELINE 1

=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 | active
=C2=A03119645 | 00:00:00.454917 =C2=A0 =C2=A0 =C2=A0= =C2=A0 | select easi1_0.signal_id,easi1_0.action_list,easi1_0.additional_i= nfo,easi1_0.address,easi1_0.alloc_voip_extn,easi1_0.app_version,easi1_0.cal= ler_name,easi1_0.caller_no,easi1
_0.close_remarks,easi1_0.count,easi1_0.= device_info,easi1_0.district_code,easi1_0.emergency_signal,easi1_0.event_ty= pe,easi1_0.gender,easi1_0.gps_accuracy,easi1_0.imei_no,easi1_0.informed_off= icers,easi1_0.invoke
_id,easi1_0.is_shout,easi1_0.last_update_time,easi1= _0.latitude,easi1_0.longitude,easi1_0.place,easi1_0.ps_code,easi1_0.receive= _time,easi1_0.es_signal_id,easi1_0.rescuer_count,easi1_0.service,easi1_0.si= gnal_s
tatus,easi1_0.signal_type,easi1_0.silent_communication,easi1_0.so= urce_type,easi1_0.state_code,easi1_0.is_valid_gps from es_app.es_app_signal= _info easi1_0 where easi1_0.caller_no=3D$1 and easi1_0.imei_no=3D$2 a
nd= easi1_0.source_type=3D$3 order by easi1_0.last_update_time desc fetch firs= t $4 rows only
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 | active
=C2=A0.............................
..........= ...................................




--0000000000004134b3064c416613--