public inbox for [email protected]  
help / color / mirror / Atom feed
From: ravi k <[email protected]>
To: Tomas Vondra <[email protected]>
Cc: Álvaro Herrera <[email protected]>
Cc: Ramakrishna m <[email protected]>
Cc: pgsql-general <[email protected]>
Subject: Re: Commit Latency
Date: Sat, 8 Feb 2025 13:46:44 +0530
Message-ID: <CAFL4M8EBX4u0nTKAe4r2LHyyuGKELALcOYs3fPQ6tQdeSnw-Rg@mail.gmail.com> (raw)
In-Reply-To: <[email protected]>
References: <CAG-eXHLT2Fgw_szi3KUd8a2AE7cgzBsdi3MaUaGb=TQ+w7oV4w@mail.gmail.com>
	<[email protected]>
	<CAFL4M8FrJqr_VZ4pMu-xNTYA_sg8W=vvrq_d-45zd9nWH7wX1g@mail.gmail.com>
	<[email protected]>

Please find the detail below:

Our system is of 4TB(SSD shared pure storage) and *xfs *filesystem:

   - Also handling 300 TPS (~3500 DML queries)
      - 80% of modified data is BLOB (text)
      - Table sizes generally under 70GB
      - Unusual naptime configuration due to data characteristics
      - Idle in TX expected regularly ~1ms
      - Action items are in pipeline:
      a. Adjusting vacuum parameters
      b. Planning manual vacuum jobs during off-hours
      - Note: App team is working on refactoring, but this will take time

Pg_stat_activity o/p:
state        | count
---------------------+-------
                     |     5
 active              |     9
 idle in transaction |     2
 idle                |   212
(4 rows)

Laurenz, thanks for the article.


Alvaro, regarding pg_wait_sampling, we are unable to enable it at this time
due to the requirement for a server restart.


Tomas, please find the fsync log attached below for your review.Let me know
if you need any further information or clarification.



5 seconds per test
O_DIRECT supported on this platform for open_datasync and open_sync.

Compare file sync methods using one 8kB write:
(in wal_sync_method preference order, except fdatasync is Linux's default)
        open_datasync                      3530.460 ops/sec     283 usecs/op
        fdatasync                          3477.850 ops/sec     288 usecs/op
        fsync                              2990.668 ops/sec     334 usecs/op
        fsync_writethrough                              n/a
        open_sync                          3287.932 ops/sec     304 usecs/op

Compare file sync methods using two 8kB writes:
(in wal_sync_method preference order, except fdatasync is Linux's default)
        open_datasync                      1752.086 ops/sec     571 usecs/op
        fdatasync                          3198.132 ops/sec     313 usecs/op
        fsync                              2980.334 ops/sec     336 usecs/op
        fsync_writethrough                              n/a
        open_sync                          1555.356 ops/sec     643 usecs/op

Compare open_sync with different write sizes:
(This is designed to compare the cost of writing 16kB in different write
open_sync sizes.)
         1 * 16kB open_sync write          2898.926 ops/sec     345 usecs/op
         2 *  8kB open_sync writes         1631.841 ops/sec     613 usecs/op
         4 *  4kB open_sync writes          861.491 ops/sec    1161 usecs/op
         8 *  2kB open_sync writes          451.548 ops/sec    2215 usecs/op
        16 *  1kB open_sync writes          233.062 ops/sec    4291 usecs/op

Test if fsync on non-write file descriptor is honored:
(If the times are similar, fsync() can sync data written on a different
descriptor.)
        write, fsync, close                2747.990 ops/sec     364 usecs/op
        write, close, fsync                2804.104 ops/sec     357 usecs/op

Non-sync'ed 8kB writes:
        write                            386932.826 ops/sec       3 usecs/op






*We suspect issues with wal_init_zero due to a 3-minute log commit spike
coinciding with increased WAL creation time, as shown in the attached
graphs.*

*Note:  Size of the wal file is 1GB and generates wal of 3GB/min*
            active | IO: *WALInitWrite    *| 00:00:00.249311 |      COMMIT
       IO | *WALInitSync   *| active  474.163 ms| COMMIT

Commit spike timings:

[image: image.png]

Wal creation_timings:

[image: image.png]

Best,
Ravi

On Fri, Feb 7, 2025 at 10:14 PM Tomas Vondra <[email protected]> wrote:

> On 2/7/25 14:00, ravi k wrote:
> > Thanks for the suggestions!
> >
> > It looks the issue is happening at the time of wal creation, does
> > wal_init_zero off is good option?
> >
>
> It's impossible to say if that option will help, considering how little
> info about the system you provided. I suggest you start by answering the
> questions you've been asked by me & Alvaro, explain why you concluded
> it's related to WAL creation, etc.
>
>
> regards
>
> --
> Tomas Vondra
>
>


Attachments:

  [image/png] image.png (57.9K, 3-image.png)
  download | view image

  [image/png] image.png (42.7K, 4-image.png)
  download | view image

view thread (7+ messages)

reply

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Reply to all the recipients using the --to and --cc options:
  reply via email

  To: [email protected]
  Cc: [email protected], [email protected], [email protected], [email protected], [email protected]
  Subject: Re: Commit Latency
  In-Reply-To: <CAFL4M8EBX4u0nTKAe4r2LHyyuGKELALcOYs3fPQ6tQdeSnw-Rg@mail.gmail.com>

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox