Heavy load on DB Cluster

public inbox for [email protected]  
help / color / mirror / Atom feed

Heavy load on DB Cluster
2+ messages / 2 participants
[nested] [flat]

* Heavy load on DB Cluster
@ 2026-03-05 07:00  KK CHN <[email protected]>
  0 siblings, 1 reply; 2+ messages in thread

From: KK CHN @ 2026-03-05 07:00 UTC (permalink / raw)
  To: pgsql-general

List,

I am experiencing heavy load on my database cluster and DB server
performance degrading over the time.
vCPUs 16 ,  Mem 32 G   Swap : 8G   storage 5T       RHEL 9.4  postgres 16


top - 11:55:18 up 175 days,  7:52,  3 users,  load average: 11.07, 10.05,
9.56
Tasks: 731 total,  14 running, 717 sleeping,   0 stopped,   0 zombie
%Cpu(s): 28.8 us,  9.3 sy,  0.0 ni, 44.9 id, 13.7 wa,  0.8 hi,  2.5 si,
 0.0 st
MiB Mem :  31837.6 total,    531.8 free,  14773.3 used,  25392.0 buff/cache
MiB Swap:   8060.0 total,   5140.4 free,   2919.6 used.  17064.2 avail Mem

    PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+
COMMAND
3148533 postgre+  20   0 8973544   7.3g   7.3g S  32.9  23.5   0:22.52
postgres
3150012 postgre+  20   0 8991380   7.4g   7.4g S  32.9  23.9   0:26.16
postgres
3081907 postgre+  20   0 9078400   8.0g   7.9g R  21.9  25.8   2:44.53
postgres
3125409 postgre+  20   0 9075568   8.1g   8.0g S  21.3  26.0   2:34.63
postgres
3126500 postgre+  20   0 9073928   8.0g   7.9g S  18.3  25.7   2:33.10
postgres
3081925 postgre+  20   0 9059088   8.2g   8.2g S  17.6  26.5   6:38.79
postgres


I have pgbackrest(2.52.1) running for incremental backups to a remote
reposerver and local   WAL replication configured to an onprem standalone
instance on another VM in the same local LAN.


archive_mode = on               # enables archiving; off, on, or always
                                # (change requires restart)
                                # (empty string indicates archive_command
should
                                # be used)
archive_command = 'pgbackrest --stanza=My_Repo archive-push %p'


*To identify the resource consuming queries I ran  and found  only one  [40
days 17:22:59.029204 | START_REPLICATION 8E ]  *and rest all seems normal
.

 How come this START_REPLICATION running for 40 days and 17:30 Hrs Is this
normal ??   * Is this due to pgbackrest    or   WAL replication *to local
VM ?

What may be the issue and how to resolve it ?


Any hints much appreciated..  Please see the below pasted outputs for more
information.

Thank you,
Krishane



Any more tests I need to perform let me know, I can produce those
information also.


postgres=# SELECT pid, now() - query_start AS duration, query, state FROM
pg_stat_activity WHERE state != 'idle' ORDER BY duration DESC LIMIT 10;

   pid   |        duration         |


                                                     query


                  |        state
---------+-------------------------+--------------------------------------------------------------------------------------------------------------------------------------------
------------------+---------------------
 2653841 | 40 days 17:22:59.029204 | START_REPLICATION 8EF/40000000
TIMELINE 1

                  | active
 3119645 | 00:00:00.454917         | select
easi1_0.signal_id,easi1_0.action_list,easi1_0.additional_info,easi1_0.address,easi1_0.alloc_voip_extn,easi1_0.app_version,easi1_0.caller_name,easi1_0.caller_no,easi1
_0.close_remarks,easi1_0.count,easi1_0.device_info,easi1_0.district_code,easi1_0.emergency_signal,easi1_0.event_type,easi1_0.gender,easi1_0.gps_accuracy,easi1_0.imei_no,easi1_0.informed_officers,easi1_0.invoke
_id,easi1_0.is_shout,easi1_0.last_update_time,easi1_0.latitude,easi1_0.longitude,easi1_0.place,easi1_0.ps_code,easi1_0.receive_time,easi1_0.es_signal_id,easi1_0.rescuer_count,easi1_0.service,easi1_0.signal_s
tatus,easi1_0.signal_type,easi1_0.silent_communication,easi1_0.source_type,easi1_0.state_code,easi1_0.is_valid_gps
from es_app.es_app_signal_info easi1_0 where easi1_0.caller_no=$1 and
easi1_0.imei_no=$2 a
nd easi1_0.source_type=$3 order by easi1_0.last_update_time desc fetch
first $4 rows only
                  | active
 .............................
.............................................


^ permalink  raw  reply  [nested|flat] 2+ messages in thread

* Re: Heavy load on DB Cluster
@ 2026-03-05 08:18  Laurenz Albe <[email protected]>
  parent: KK CHN <[email protected]>
  0 siblings, 0 replies; 2+ messages in thread

From: Laurenz Albe @ 2026-03-05 08:18 UTC (permalink / raw)
  To: KK CHN <[email protected]>; pgsql-general

On Thu, 2026-03-05 at 12:30 +0530, KK CHN wrote:
> I am experiencing heavy load on my database cluster and DB server performance degrading over the time.
> vCPUs 16 ,  Mem 32 G   Swap : 8G   storage 5T       RHEL 9.4  postgres 16
> 
> 
> top - 11:55:18 up 175 days,  7:52,  3 users,  load average: 11.07, 10.05, 9.56
> Tasks: 731 total,  14 running, 717 sleeping,   0 stopped,   0 zombie
> %Cpu(s): 28.8 us,  9.3 sy,  0.0 ni, 44.9 id, 13.7 wa,  0.8 hi,  2.5 si,  0.0 st
> MiB Mem :  31837.6 total,    531.8 free,  14773.3 used,  25392.0 buff/cache
> MiB Swap:   8060.0 total,   5140.4 free,   2919.6 used.  17064.2 avail Mem
> 
>     PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
> 3148533 postgre+  20   0 8973544   7.3g   7.3g S  32.9  23.5   0:22.52 postgres
> 3150012 postgre+  20   0 8991380   7.4g   7.4g S  32.9  23.9   0:26.16 postgres
> 3081907 postgre+  20   0 9078400   8.0g   7.9g R  21.9  25.8   2:44.53 postgres
> 3125409 postgre+  20   0 9075568   8.1g   8.0g S  21.3  26.0   2:34.63 postgres
> 3126500 postgre+  20   0 9073928   8.0g   7.9g S  18.3  25.7   2:33.10 postgres
> 3081925 postgre+  20   0 9059088   8.2g   8.2g S  17.6  26.5   6:38.79 postgres 
> 
> I have pgbackrest(2.52.1) running for incremental backups to a remote reposerver
> and local   WAL replication configured to an onprem standalone  instance on
> another VM in the same local LAN. 
> 
> archive_mode = on               # enables archiving; off, on, or always
>                                 # (change requires restart)
>                                 # (empty string indicates archive_command should
>                                 # be used)
> archive_command = 'pgbackrest --stanza=My_Repo archive-push %p'
> 
> To identify the resource consuming queries I ran  and found  only one
> [40 days 17:22:59.029204 | START_REPLICATION 8E ]  and rest all seems normal .   
> 
> How come this START_REPLICATION running for 40 days and 17:30 Hrs Is this normal ??
> Is this due to pgbackrest    or   WAL replication to local VM ?

That is an active replication - or pg_receivewal, which amounts to the same.
That's not really a query; the standby is streaming WAL from the primary and
has been doing that for over 40 days.  Nothing to worry about.

> What may be the issue and how to resolve it ?

For that, configure pg_stat_statements and use it to find your most time-consuming
statements.

Yours,
Laurenz Albe






^ permalink  raw  reply  [nested|flat] 2+ messages in thread

end of thread, other threads:[~2026-03-05 08:18 UTC | newest]

Thread overview: 2+ messages (download: mbox.gz follow: Atom feed)
-- links below jump to the message on this page --
2026-03-05 07:00 Heavy load on DB Cluster KK CHN <[email protected]>
2026-03-05 08:18 ` Laurenz Albe <[email protected]>

This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox