Re: Increase default maintenance_io_concurrency to 16

From: Gregory Smith <[email protected]>
To: Andres Freund <[email protected]>
Cc: Bruce Momjian <[email protected]>
Cc: Melanie Plageman <[email protected]>
Cc: PostgreSQL-development <[email protected]>
Cc: Greg Smith <[email protected]>
Subject: Re: Increase default maintenance_io_concurrency to 16
Date: Tue, 18 Mar 2025 19:52:04 -0400
Message-ID: <CAHLJuCVJkfNozHikJYGUe+xHnPGL-+YB7RWsuC0c_XR0ytrnQg@mail.gmail.com>
In-Reply-To: <4p7gtb2nfr3njhgq7bmpe24unsbyoerlom7zrcu5sl2vyyutlp@ol5ywrm7j5ok>
References: <[email protected]>
	<[email protected]>
	<fycuaxeisofdyv4265ejiluixuklmoaaoywa7nhz3heytf4na6@3vd6owgwvadc>
	<[email protected]>
	<rdd4m4fze6zt4fjmp2p2ez5smokglno7pqefloyiql36kkudw6@pwpmabadygqz>
	<[email protected]>
	<4p7gtb2nfr3njhgq7bmpe24unsbyoerlom7zrcu5sl2vyyutlp@ol5ywrm7j5ok>

On Tue, Mar 18, 2025 at 5:04 PM Andres Freund <[email protected]> wrote:

> Is that actually a good description of what we assume? I don't know where
> that 90% is coming from?

That one's all my fault.  It was an attempt to curve-fit backwards why the
4.0 number Tom set with his initial commit worked as well as it did given
that underlying storage was closer to 50X as slow, and I sold the idea well
enough for Bruce to follow the reasoning and commit it.  Back then there
was a regular procession of people who measured the actual rate and
wondered why there was an order of magnitude difference between those
measurements and the parameter.  Pointing them toward thinking in terms of
the cached read percentage too did a reasonable job of steering them toward
why the model was more complicated than it seems.  I intended to follow
that up with more measurements, only to lose the whole project into a
non-disclosure void I have only recently escaped.
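
As a rough illustration of that curve fit, here is a minimal sketch of the
blended-cost arithmetic.  It assumes a cached random read costs nothing
relative to a sequential read, which is a simplification, and the function
names are hypothetical rather than anything in the PostgreSQL source.  The
50X ratio and the 90% cached figure are the ones discussed above.

```python
def blended_random_cost(raw_ratio, cached_fraction, cached_cost=0.0):
    """Average cost of a random page read, in units of one sequential read.

    raw_ratio: cost of an uncached random read relative to a sequential read.
    cached_fraction: share of random reads assumed to be served from cache.
    cached_cost: cost charged for a cached read (0.0 is an assumption here).
    """
    if not 0.0 <= cached_fraction <= 1.0:
        raise ValueError("cached_fraction must be between 0 and 1")
    return cached_fraction * cached_cost + (1.0 - cached_fraction) * raw_ratio


def implied_cached_fraction(planner_cost, raw_ratio):
    """Cached fraction at which raw_ratio blends down to planner_cost,
    again assuming cached reads are free."""
    if raw_ratio <= 0:
        raise ValueError("raw_ratio must be positive")
    return 1.0 - planner_cost / raw_ratio


# Storage about 50X slower on random reads, 90% of those reads cached.
print(f"blended cost: {blended_random_cost(50, 0.90):.2f}")

# Cached share needed for a 50X raw ratio to land exactly on 4.0.
print(f"implied cached fraction: {implied_cached_fraction(4.0, 50):.2f}")
```

Under these assumptions a 50X raw ratio with 90% of reads cached blends to
roughly the same order as the 4.0 setting, which is the kind of backwards
fit described above rather than a measured result.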

I agree with your observation that the underlying cost of a non-sequential
read stall on cloud storage is not markedly better than the original
random:sequential ratio of mechanical drives.  And the PG17 refactoring
to improve I/O chunking worked to magnify that further.

The end of this problem I'm working on again is assembling some useful mix
of workloads such that I can try changing one of these magic constants with
higher confidence.  My main working set so far is write performance
regression tests against the OpenStreetMap loading workload that I've
been blogging about, plus the old read-only SELECT-only queries spaced
along a scale/client grid.  My experiments so far have been around
another Tom special, the maximum buffer usage count limit, which turned
into another black hole full of work I have only recently escaped.  I
haven't really thought much yet about a workload set that would allow
adjusting random_page_cost.  On the query side we've been pretty heads down
on the TPC-H and ClickBench sets.  I don't have buffer internals data from
those yet though; I will have to add that to the work queue.

--
Greg Smith
Director of Open Source Strategy, Crunchy Data
[email protected]
