public inbox for [email protected]  
BDR, wal sender, high system cpu, mutex_lock_common

* BDR, wal sender, high system cpu, mutex_lock_common
@ 2017-09-29 23:07 milist ujang <[email protected]>
  2017-10-01 01:36 ` Re: BDR, wal sender, high system cpu, mutex_lock_common milist ujang <[email protected]>
  0 siblings, 1 reply; 2+ messages in thread

From: milist ujang @ 2017-09-29 23:07 UTC (permalink / raw)
  To: pgsql-performance

Hi all,


I have a PostgreSQL 9.4 + BDR environment:
PostgreSQL 9.4.4 on x86_64-unknown-linux-gnu, compiled by gcc (Debian 4.7.2-5) 4.7.2, 64-bit

kernel version:
3.2.0-4-amd64 #1 SMP Debian 3.2.65-1 x86_64 GNU/Linux

This is a consolidation database server; there are more than 250 WAL
sender processes on this machine.

top shows high system CPU:
%Cpu(s):  1.4 us, 49.7 sy,  0.0 ni, 48.8 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st

profiling cpu with perf:

perf top -e cpu-clock

Events: 142K cpu-clock
 82.37%  [kernel]            [k] __mutex_lock_common.isra.5
  4.49%  [kernel]            [k] do_raw_spin_lock
  2.23%  [kernel]            [k] mutex_lock
  2.16%  [kernel]            [k] mutex_unlock
  2.12%  [kernel]            [k] arch_local_irq_restore
  1.73%  postgres            [.] ValidXLogRecord
  0.87%  [kernel]            [k] __mutex_unlock_slowpath
  0.78%  [kernel]            [k] arch_local_irq_enable
  0.63%  [kernel]            [k] sys_recvfrom


I then profiled one of the processes (a WAL sender) that is spending its time in mutexes:

perf top -e task-clock -p 55382

Events: 697  task-clock
 88.08%  [kernel]  [k] __mutex_lock_common.isra.5
  3.27%  [kernel]  [k] do_raw_spin_lock
  2.34%  [kernel]  [k] arch_local_irq_restore
  2.10%  postgres  [.] ValidXLogRecord
  1.87%  [kernel]  [k] mutex_unlock
  1.87%  [kernel]  [k] mutex_lock
  0.47%  [kernel]  [k] sys_recvfrom
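
To put a single number on the profile above, the mutex-related symbols can be summed. This is only a rough sketch: the helper name `share_matching` is hypothetical, and matching on the substring "mutex" in the symbol name is an assumption about which rows belong together.

```python
# Per-process perf output, copied from the message above.
PERF_TOP = """\
 88.08%  [kernel]  [k] __mutex_lock_common.isra.5
  3.27%  [kernel]  [k] do_raw_spin_lock
  2.34%  [kernel]  [k] arch_local_irq_restore
  2.10%  postgres  [.] ValidXLogRecord
  1.87%  [kernel]  [k] mutex_unlock
  1.87%  [kernel]  [k] mutex_lock
  0.47%  [kernel]  [k] sys_recvfrom
"""


def share_matching(text, needle):
    """Sum the sample percentages of rows whose symbol name contains `needle`."""
    total = 0.0
    for line in text.splitlines():
        fields = line.split()
        # Expected row layout: percentage, object, symbol type, symbol name.
        if len(fields) >= 4 and fields[0].endswith("%") and needle in fields[-1]:
            total += float(fields[0].rstrip("%"))
    return total


mutex_share = share_matching(PERF_TOP, "mutex")
print(f"mutex-related symbols: {mutex_share:.2f}% of samples")
```

For the profile above this comes to roughly 91.8% of the samples of that one process, against about 2% in ValidXLogRecord.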

As far as I can tell, BDR is only reading WAL files here (we are currently
behind the current WAL LSN), so why does reading a WAL file need a mutex?

Is there a kernel version that handles mutexes better?


-- 
regards

ujang jaenudin | DBA Consultant (Freelancer)
http://ora62.wordpress.com
http://id.linkedin.com/pub/ujang-jaenudin/12/64/bab



* Re: BDR, wal sender, high system cpu, mutex_lock_common
  2017-09-29 23:07 BDR, wal sender, high system cpu, mutex_lock_common milist ujang <[email protected]>
@ 2017-10-01 01:36 ` milist ujang <[email protected]>
  0 siblings, 0 replies; 2+ messages in thread

From: milist ujang @ 2017-10-01 01:36 UTC (permalink / raw)
  To: pgsql-performance

Additional info, strace output:

% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 98.30    1.030072           5    213063    201463 read
  1.69    0.017686           0    201464    201464 recvfrom
  0.01    0.000110           0       806           lseek
  0.00    0.000043           0       474       468 rt_sigreturn
  0.00    0.000000           0         6           open
  0.00    0.000000           0         6           close
------ ----------- ----------- --------- --------- ----------------
100.00    1.047911                415819    403395 total
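
The summary above can be reduced to a failure ratio per syscall, which makes the pattern easier to see. This is a minimal sketch: the helper names `parse_summary` and `failure_ratio` are hypothetical, and the parsing assumes the column layout shown in this message. The summary does not show which errno the failed calls returned, so it cannot by itself confirm that these are, for example, EAGAIN results from non-blocking descriptors.

```python
# Body rows of the strace summary, copied from the message above.
STRACE_SUMMARY = """\
 98.30    1.030072           5    213063    201463 read
  1.69    0.017686           0    201464    201464 recvfrom
  0.01    0.000110           0       806           lseek
  0.00    0.000043           0       474       468 rt_sigreturn
  0.00    0.000000           0         6           open
  0.00    0.000000           0         6           close
"""


def parse_summary(text):
    """Return {syscall: (calls, errors)}; a blank errors column counts as 0."""
    rows = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 6:    # % time, seconds, usecs/call, calls, errors, syscall
            calls, errors, name = int(fields[3]), int(fields[4]), fields[5]
        elif len(fields) == 5:  # errors column is empty
            calls, errors, name = int(fields[3]), 0, fields[4]
        else:
            continue
        rows[name] = (calls, errors)
    return rows


def failure_ratio(rows, name):
    """Fraction of calls to `name` that returned an error."""
    calls, errors = rows[name]
    return errors / calls if calls else 0.0


rows = parse_summary(STRACE_SUMMARY)
for name in ("read", "recvfrom"):
    calls, errors = rows[name]
    print(f"{name}: {errors}/{calls} failed ({failure_ratio(rows, name):.1%})")
```

With these numbers, about 94.6% of the read calls and every recvfrom call returned an error, and the two error counts differ by only one (201463 and 201464).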




