Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtp (Exim 4.84_2) (envelope-from ) id 1dZwIf-0006Ya-Vt for pgsql-performance@arkaria.postgresql.org; Tue, 25 Jul 2017 09:38:46 +0000 Received: from localhost ([127.0.0.1] helo=postgresql.org) by malur.postgresql.org with smtp (Exim 4.84_2) (envelope-from ) id 1dZwIf-00069b-Ci for pgsql-performance@arkaria.postgresql.org; Tue, 25 Jul 2017 09:38:45 +0000 Received: from makus.postgresql.org ([2001:4800:1501:1::229]) by malur.postgresql.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_CBC_SHA384:256) (Exim 4.84_2) (envelope-from ) id 1dZwGp-0002l9-GT for pgsql-performance@postgresql.org; Tue, 25 Jul 2017 09:36:51 +0000 Received: from mail-qk0-x22c.google.com ([2607:f8b0:400d:c09::22c]) by makus.postgresql.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_CBC_SHA1:256) (Exim 4.84_2) (envelope-from ) id 1dZwGm-00017r-2E for pgsql-performance@postgresql.org; Tue, 25 Jul 2017 09:36:50 +0000 Received: by mail-qk0-x22c.google.com with SMTP id k2so27330109qkf.0 for ; Tue, 25 Jul 2017 02:36:47 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:in-reply-to:references:from:date:message-id:subject:to; bh=dfwN97wPUqRNc8ttdnlgNd7RWJD1lQbw3R2GhrxtlAo=; b=DJN7y0Z89Mk7y7ZvnwMD6qbYuOOvVOMnbZrFjfg+BthjeLOycn8LVlAYE3JfOxxtvG /8Lrq7hib3q7nKbBvZt8KXsD2CLulJ0N/nEHKYRAKDPsujhUVDq84vrdKWM2IW5Pijlr nvss+M082Sxg2ObcqWdxiIeLSj+HaXfqGQRO/GIpJkeWwpcJyBocIikxL+eNyB0TbVPB l8ArD2/qZDfGCZuVQOvmvLrDG1nHi/la3yr4TJB/4ZTDjJDH22zS9FRS2LRvJB0R3DeZ pCmL870+yO1IN/IZq+hdd83/yAJ6TR2ZKmjt8L+R/QQj+bHH0cM4rUIK+dN+5E8QyK94 5T8g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:in-reply-to:references:from:date :message-id:subject:to; bh=dfwN97wPUqRNc8ttdnlgNd7RWJD1lQbw3R2GhrxtlAo=; b=Zdypy0jFbRNnrsUjxAQ3WwsSRu2Gi83kGJ4Q4zZexan4YctjuJ/AVVytBeq2pouqTa LDu2ZVE43gBMcbutqR2X9a+v+PLmIUh37JEEIlqzOoMpIshaG69lhct2RKpPb2CDrpGN oZ10lBmpiQv+Xx4bzvppTlxfF6PGT+U29zziVasjwHeBHIM0VaxcLrdRgiNwyR3KlMpl +oLxWGTNNzmgqt8+NN6nNxUExujGgXzPPeBoPH2yLOkivmn5RtJDvb0tNiPMLWrrSwo5 B+ud8kQnBvL74OlR3W4VQFjaj184yjDhd0Bz94OLre50kPQbs6Ura719W9pf4XvuLoka cJow== X-Gm-Message-State: AIVw111fqgLKTy7A41b2cOBdRaq0GDHMijINYgkEKevCrp0ucz/r1i7h p0sCmg8k1H5E/LOUN8llPo1brpw3yG+9 X-Received: by 10.55.9.140 with SMTP id 134mr22555508qkj.355.1500975406470; Tue, 25 Jul 2017 02:36:46 -0700 (PDT) MIME-Version: 1.0 Received: by 10.12.157.226 with HTTP; Tue, 25 Jul 2017 02:36:25 -0700 (PDT) In-Reply-To: References: From: Charles Nadeau Date: Tue, 25 Jul 2017 11:36:25 +0200 Message-ID: Subject: Re: Very poor read performance, query independent To: "pgsql-performa." Content-Type: multipart/alternative; boundary="001a11488dbceb5a4205552111c2" List-Archive: List-Help: List-ID: List-Owner: List-Post: List-Subscribe: List-Unsubscribe: X-Mailing-List: pgsql-performance Precedence: bulk Sender: pgsql-performance-owner@postgresql.org --001a11488dbceb5a4205552111c2 Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable All, Here is a list of what I did based of the suggestions made after my initial post: *Reduce max_parallel_workers to 4: Values higher makes the workers wait for data as the RAID0 array can't deliver high enough IOPS. *Reduce random_page_cost to 1: Forcing the use of index makes queries faster despite low random throughput. *Increase shared_buffer to 66GB and effective_cache_size to 53GB: With the new server having 144GB of RAM, increasing shared_buffer allows Postgresql to keep a lot of data in memory reducing the need to go to disk. *Reduce min_parallel_relation_size to 512kB to have more workers when doing sequential parallel scan *Increased the /sys/block/sd[ac]/queue/read_ahead_kb to 16384 for my arrays using HDD *Reused old SSDs (that are compatible with my RAID controller, to my surprise) to put my most used index and tables. Thanks to everybody who made suggestions. I now know more about Postgresql tuning. Charles On Mon, Jul 10, 2017 at 4:03 PM, Charles Nadeau wrote: > I=E2=80=99m running PostgreSQL 9.6.3 on Ubuntu 16.10 (kernel 4.4.0-85-gen= eric). > Hardware is: > > *2x Intel Xeon E5550 > > *72GB RAM > > *Hardware RAID10 (4 x 146GB SAS 10k) P410i controller with 1GB FBWC (80% > read/20% write) for Postgresql data only: > > Logical Drive: 3 > > Size: 273.4 GB > > Fault Tolerance: 1+0 > > Heads: 255 > > Sectors Per Track: 32 > > Cylinders: 65535 > > Strip Size: 128 KB > > Full Stripe Size: 256 KB > > Status: OK > > Caching: Enabled > > Unique Identifier: 600508B1001037383941424344450A00 > > Disk Name: /dev/sdc > > Mount Points: /mnt/data 273.4 GB > > OS Status: LOCKED > > Logical Drive Label: A00A194750123456789ABCDE516F > > Mirror Group 0: > > physicaldrive 2I:1:5 (port 2I:box 1:bay 5, SAS, 146 GB, OK) > > physicaldrive 2I:1:6 (port 2I:box 1:bay 6, SAS, 146 GB, OK) > > Mirror Group 1: > > physicaldrive 2I:1:7 (port 2I:box 1:bay 7, SAS, 146 GB, OK) > > physicaldrive 2I:1:8 (port 2I:box 1:bay 8, SAS, 146 GB, OK) > > Drive Type: Data > > Formatted with ext4 with: sudo mkfs.ext4 -E stride=3D32,stripe_width=3D64= -v > /dev/sdc1. > > Mounted in /etc/fstab with this line: "UUID=3D99fef4ae-51dc-4365-9210-0b1= 53b1cbbd0 > /mnt/data ext4 rw,nodiratime,user_xattr,noatime,nobarrier,errors=3Dremoun= t-ro > 0 1" > > Postgresql is the only application running on this server. > > > Postgresql is used as a mini data warehouse to generate reports and do > statistical analysis. It is used by at most 2 users and fresh data is add= ed > every 10 days. The database has 16 tables: one is 224GB big and the rest > are between 16kB and 470MB big. > > > My configuration is: > > > name | current_setting | source > > ---------------------------------+-------------------------- > ----------------------+---------------------- > > application_name | psql | client > > autovacuum_vacuum_scale_factor | 0 | configuration file > > autovacuum_vacuum_threshold | 2000 | configuration file > > checkpoint_completion_target | 0.9 | configuration file > > checkpoint_timeout | 30min | configuration file > > client_encoding | UTF8 | client > > client_min_messages | log | configuration file > > cluster_name | 9.6/main | configuration file > > cpu_index_tuple_cost | 0.001 | configuration file > > cpu_operator_cost | 0.0005 | configuration file > > cpu_tuple_cost | 0.003 | configuration file > > DateStyle | ISO, YMD | configuration file > > default_statistics_target | 100 | configuration file > > default_text_search_config | pg_catalog.english | configuration file > > dynamic_shared_memory_type | posix | configuration file > > effective_cache_size | 22GB | configuration file > > effective_io_concurrency | 4 | configuration file > > external_pid_file | /var/run/postgresql/9.6-main.pid | configuration file > > lc_messages | C | configuration file > > lc_monetary | en_CA.UTF-8 | configuration file > > lc_numeric | en_CA.UTF-8 | configuration file > > lc_time | en_CA.UTF-8 | configuration file > > listen_addresses | * | configuration file > > lock_timeout | 100s | configuration file > > log_autovacuum_min_duration | 0 | configuration file > > log_checkpoints | on | configuration file > > log_connections | on | configuration file > > log_destination | csvlog | configuration file > > log_directory | /mnt/bigzilla/data/toburn/hp/postgresql/pg_log | > configuration file > > log_disconnections | on | configuration file > > log_error_verbosity | default | configuration file > > log_file_mode | 0600 | configuration file > > log_filename | postgresql-%Y-%m-%d_%H%M%S.log | configuration file > > log_line_prefix | user=3D%u,db=3D%d,app=3D%aclient=3D%h | configuration f= ile > > log_lock_waits | on | configuration file > > log_min_duration_statement | 0 | configuration file > > log_min_error_statement | debug1 | configuration file > > log_min_messages | debug1 | configuration file > > log_rotation_size | 1GB | configuration file > > log_temp_files | 0 | configuration file > > log_timezone | localtime | configuration file > > logging_collector | on | configuration file > > maintenance_work_mem | 3GB | configuration file > > max_connections | 10 | configuration file > > max_locks_per_transaction | 256 | configuration file > > max_parallel_workers_per_gather | 14 | configuration file > > max_stack_depth | 2MB | environment variable > > max_wal_size | 4GB | configuration file > > max_worker_processes | 14 | configuration file > > min_wal_size | 2GB | configuration file > > parallel_setup_cost | 1000 | configuration file > > parallel_tuple_cost | 0.012 | configuration file > > port | 5432 | configuration file > > random_page_cost | 22 | configuration file > > seq_page_cost | 1 | configuration file > > shared_buffers | 34GB | configuration file > > shared_preload_libraries | pg_stat_statements | configuration file > > ssl | on | configuration file > > ssl_cert_file | /etc/ssl/certs/ssl-cert-snakeoil.pem | configuration file > > ssl_key_file | /etc/ssl/private/ssl-cert-snakeoil.key | configuration fil= e > > statement_timeout | 1000000s | configuration file > > stats_temp_directory | /var/run/postgresql/9.6-main.pg_stat_tmp | > configuration file > > superuser_reserved_connections | 1 | configuration file > > syslog_facility | local1 | configuration file > > syslog_ident | postgres | configuration file > > syslog_sequence_numbers | on | configuration file > > temp_file_limit | 80GB | configuration file > > TimeZone | localtime | configuration file > > track_activities | on | configuration file > > track_counts | on | configuration file > > track_functions | all | configuration file > > unix_socket_directories | /var/run/postgresql | configuration file > > vacuum_cost_delay | 1ms | configuration file > > vacuum_cost_limit | 5000 | configuration file > > vacuum_cost_page_dirty | 200 | configuration file > > vacuum_cost_page_hit | 10 | configuration file > > vacuum_cost_page_miss | 100 | configuration file > > wal_buffers | 16MB | configuration file > > wal_compression | on | configuration file > > wal_sync_method | fdatasync | configuration file > > work_mem | 1468006kB | configuration file > > > The part of /etc/sysctl.conf I modified is: > > vm.swappiness =3D 1 > > vm.dirty_background_bytes =3D 134217728 > > vm.dirty_bytes =3D 1073741824 > > vm.overcommit_ratio =3D 100 > > vm.zone_reclaim_mode =3D 0 > > kernel.numa_balancing =3D 0 > > kernel.sched_autogroup_enabled =3D 0 > > kernel.sched_migration_cost_ns =3D 5000000 > > > The problem I have is very poor read. When I benchmark my array with fio = I > get random reads of about 200MB/s and 1100IOPS and sequential reads of > about 286MB/s and 21000IPS. But when I watch my queries using pg_activity= , > I get at best 4MB/s. Also using dstat I can see that iowait time is at > about 25%. This problem is not query-dependent. > > I backed up the database, I reformated the array making sure it is well > aligned then restored the database and got the same result. > > Where should I target my troubleshooting at this stage? I reformatted my > drive, I tuned my postgresql.conf and OS as much as I could. The hardware > doesn=E2=80=99t seem to have any issues, I am really puzzled. > > Thanks! > > > Charles > > -- > Charles Nadeau Ph.D. > --=20 Charles Nadeau Ph.D. http://charlesnadeau.blogspot.com/ --001a11488dbceb5a4205552111c2 Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable
All,

Here is a list of what = I did based of the suggestions made after my initial post:
*Reduc= e max_parallel_workers to 4: Values higher makes the workers wait for data = as the RAID0 array can't deliver high enough IOPS.
*Reduce ra= ndom_page_cost to 1: Forcing the use of index makes queries faster despite = low random throughput.
*Increase shared_buffer to 66GB and effect= ive_cache_size to 53GB: With the new server having 144GB of RAM, increasing= shared_buffer allows Postgresql to keep a lot of data in memory reducing t= he need to go to disk.
*Reduce min_parallel_relation_size to 512k= B to have more workers when doing sequential parallel scan
*Incre= ased the /sys/block/sd[ac]/queue/read_ahead_kb to 16384 for my arrays using= HDD
*Reused old SSDs (that are compatible with my RAID controlle= r, to my surprise) to put my most used index and tables.

Thanks to everybody who made suggestions. I now know more about Post= gresql tuning.

Charles

On Mon, Jul 10, 2017 at 4:03 PM, = Charles Nadeau <charles.nadeau@gmail.com> wrote:
<= blockquote class=3D"gmail_quote" style=3D"margin:0 0 0 .8ex;border-left:1px= #ccc solid;padding-left:1ex">
=09 =09 =09

I=E2=80=99m running PostgreSQL 9.6.3 on Ubuntu 16.10 (kernel 4.4.0-85-generic). Hardware is:

*2x Intel Xeon E5550

*72GB RAM

*Hardware RAID10 (4 x 146GB SAS 10k) P410i controller with 1GB FBWC (80% read/20% write) for Postgresql data only:

Logical Drive: 3

Size: 273.4 GB

Fault Tolerance: 1+0

Heads: 255

Sectors Per Track: 32

Cylinders: 65535

Strip Size: 128 KB

Full Stripe Size: 256 KB

Status: OK

Caching:=20 Enabled

Unique Identifier: 600508B1001037383941424344450A00

Disk Name: /dev/sdc

Mount Points: /mnt/data 273.4 GB

OS Status: LOCKED

Logical Drive Label: A00A194750123456789ABCDE516F

Mirror Group 0:

=20 physicaldrive 2I:1:5 (port 2I:box 1:bay 5, SAS, 146 GB, OK)

=20 physicaldrive 2I:1:6 (port 2I:box 1:bay 6, SAS, 146 GB, OK)

Mirror Group 1:

=20 physicaldrive 2I:1:7 (port 2I:box 1:bay 7, SAS, 146 GB, OK)

=20 physicaldrive 2I:1:8 (port 2I:box 1:bay 8, SAS, 146 GB, OK)

Drive Type: Data

Formatted with ext4 with: sudo mkfs.ext4 -E stride=3D32,stripe_width=3D64 -v /dev/sdc1.

Mounted in /etc/fstab with this line: "UUID=3D99fef4ae-51dc-4365-9210-0b153b= 1cbbd0 /mnt/data ext4 rw,nodiratime,user_xattr,noatime,nobarrier,errors=3Dremount-ro 0 = 1"

Postgresql is the only application running on this server.


Postgresql is used as a mini data warehouse to generate reports and do statistical analysis. It is used by at most 2 users and fresh data is added every 10 days. The database has 16 tables: one is 224GB big and the rest are between 16kB and 470MB big.


My configuration is:


name =20 | current_setting | =20 source =20

---------------------------= ------+------------------------------------------------+----------------------

application_name =20 | psql | client

autovacuum_vacuum_scale_factor | 0 =20 | configuration file

autovacuum_vacuum_threshold | 2000 =20 | configuration file

checkpoint_completion_target | 0.9 =20 | configuration file

checkpoint_timeout=20 | 30min | configuration file

client_encoding =20 | UTF8 | client

client_min_messages | log | configuration file

cluster_name =20 | 9.6/main | configuration file

cpu_index_tuple_cost | 0.001 =20 | configuration file

cpu_operator_cost =20 | 0.0005 | configuration file

cpu_tuple_cost =20 | 0.003 | configuration file

DateStyle =20 | ISO, YMD | configuration file

default_statistics_target | 100 =20 | configuration file

default_text_search_config | pg_catalog.english =20 | configuration file

dynamic_shared_memory_type | posix =20 | configuration file

effective_cache_size | 22GB =20 | configuration file

effective_io_concurrency | 4 =20 | configuration file

external_pid_file =20 | /var/run/postgresql/9.6-main.pid | configuration file

lc_messages =20 | C | configuration file

lc_monetary =20 | en_CA.UTF-8 | configuration file

lc_numeric =20 | en_CA.UTF-8 | configuration file

lc_time =20 | en_CA.UTF-8 | configuration file

listen_addresses =20 | * | configuration file

lock_timeout =20 | 100s | configuration file

log_autovacuum_min_duration | 0 =20 | configuration file

log_checkpoints =20 | on | configuration file

log_connections =20 | on | configuration file

log_destination =20 | csvlog | configuration file

log_directory =20 | /mnt/bigzilla/data/toburn/hp/postgresql/pg_log | configuration file

log_disconnections=20 | on | configuration file

log_error_verbosity | default | configuration file

log_file_mode =20 | 0600 | configuration file

log_filename =20 | postgresql-%Y-%m-%d_%H%M%S.log | configuration file

log_line_prefix =20 | user=3D%u,db=3D%d,app=3D%aclient=3D%h | configuration file

log_lock_waits =20 | on | configuration file

log_min_duration_statement | 0 =20 | configuration file

log_min_error_statement | debug1 =20 | configuration file

log_min_messages =20 | debug1 | configuration file

log_rotation_size =20 | 1GB | configuration file

log_temp_files =20 | 0 | configuration file

log_timezone =20 | localtime | configuration file

logging_collector =20 | on | configuration file

maintenance_work_mem | 3GB =20 | configuration file

max_connections =20 | 10 | configuration file

max_locks_per_transaction | 256 =20 | configuration file

max_parallel_workers_per_gather | 14 = =20 | configuration file

max_stack_depth =20 | 2MB | environment variable

max_wal_size =20 | 4GB | configuration file

max_worker_processes | 14 =20 | configuration file

min_wal_size =20 | 2GB | configuration file

parallel_setup_cost | 1000 | configuration file

parallel_tuple_cost | 0.012 | configuration file

port =20 | 5432 | configuration file

random_page_cost =20 | 22 | configuration file

seq_page_cost =20 | 1 | configuration file

shared_buffers =20 | 34GB | configuration file

shared_preload_libraries | pg_stat_statements =20 | configuration file

ssl =20 | on | configuration file

ssl_cert_file =20 | /etc/ssl/certs/ssl-cert-snakeoil.pem | configuration file

ssl_key_file =20 | /etc/ssl/private/ssl-cert-snakeoil.key | configuration file

statement_timeout =20 | 1000000s | configuration file

stats_temp_directory | /var/run/postgresql/9.6-main.pg_stat_tmp | configuration file

superuser_reserved_connections | 1 =20 | configuration file

syslog_facility =20 | local1 | configuration file

syslog_ident =20 | postgres | configuration file

syslog_sequence_numbers | on =20 | configuration file

temp_file_limit =20 | 80GB | configuration file

TimeZone =20 | localtime | configuration file

track_activities =20 | on | configuration file

track_counts =20 | on | configuration file

track_functions =20 | all | configuration file

unix_socket_directories | /var/run/postgresql =20 | configuration file

vacuum_cost_delay =20 | 1ms | configuration file

vacuum_cost_limit =20 | 5000 | configuration file

vacuum_cost_page_dirty | 200 =20 | configuration file

vacuum_cost_page_hit | 10 =20 | configuration file

vacuum_cost_page_miss | 100 =20 | configuration file

wal_buffers =20 | 16MB | configuration file

wal_compression =20 | on | configuration file

wal_sync_method =20 | fdatasync | configuration file

work_mem =20 | 1468006kB | configuration file


The part of /etc/sysctl.conf I modified is:

vm.swappiness =3D 1

vm.dirty_background_bytes =3D 134217728

vm.dirty_bytes =3D 1073741824

vm.overcommit_ratio =3D 100

vm.zone_reclaim_mode =3D 0

kernel.numa_balancing =3D 0

kernel.sched_autogroup_enab= led =3D 0

kernel.sched_migration_cost= _ns =3D 5000000


The problem I have is very poor read. When I benchmark my array with fio I get random reads of about 200MB/s and 1100IOPS and sequential reads of about 286MB/s and 21000IPS. But when I watch my queries using pg_activity, I get at best 4MB/s. Also using dstat I can see that iowait time is at about 25%. This problem is not query-dependent.

I backed up the database, I reformated the array making sure it is well aligned then restored the database and got the same result.

Where should I target my troubleshooting at this stage? I reformatted my drive, I tuned my postgresql.conf and OS as much as I could. The hardware doesn=E2=80=99t seem to have any issues, I am really puzzled.

Thanks!


Charles


-= -
Charles Nadeau P= h.D.



--
--001a11488dbceb5a4205552111c2--