From: Gregory Smith
Date: Tue, 18 Mar 2025 19:52:04 -0400
Subject: Re: Increase default maintenance_io_concurrency to 16
To: Andres Freund
Cc: Bruce Momjian, Melanie Plageman, PostgreSQL-development, Greg Smith

On Tue, Mar 18, 2025 at 5:04 PM Andres Freund wrote:

> Is that actually a good description of what we assume? I don't know where that
> 90% is coming from?

That one's all my fault. It was an attempt to curve-fit backwards why the 4.0 number Tom set with his initial commit worked as well as it did, given that the underlying storage was closer to 50X as slow, and I sold the idea well enough for Bruce to follow the reasoning and commit it. Back then there was a regular procession of people who measured the actual rate and wondered why there was an order of magnitude difference between those measurements and the parameter. Pointing them toward thinking in terms of the cached read percentage too did a reasonable job of deflecting them onto why the model was more complicated than it seems. I intended to follow that up with more measurements, only to lose the whole project into a non-disclosure void I have only recently escaped.

I agree with your observation that the underlying cost of a non-sequential read stall on cloud storage is not markedly better than the original random:sequential ratio of mechanical drives. And the PG17 refactoring to improve I/O chunking worked to magnify that further.

The end of this problem I'm working on again is assembling some useful mix of workloads such that I can try changing one of these magic constants with higher confidence. My main working set so far is write performance regression test sets against the Open Street Map loading workload that I've been blogging about, plus the old read-only SELECT-only queries spaced along a scale/client grid. My experiments so far have been around another Tom special, the maximum buffer usage count limit, which turned into another black hole full of work I have only recently escaped. I haven't really thought much yet about a workload set that would allow adjusting random_page_cost. On the query side we've been pretty heads down on the TPC-H and ClickBench sets. I don't have buffer internals data from those yet though; I will have to add that to the work queue.

--
Greg Smith
Director of Open Source Strategy, Crunchy Data
greg.smith@crunchydata.com