Date: Sat, 26 Jul 2025 03:22:24 +0200
From: "Pierre Barre"
To: "Jeff Ross", pgsql-general@lists.postgresql.org
Subject: Re: PostgreSQL on S3-backed Block Storage with Near-Local Performance

And finally, some read-only benchmarks with the same postgres build.

9P:

postgres@zerofs:/mnt_9p$ pgbench -vvv -c 100 -j 40 -t 10000 bench -S
pgbench (16.9 (Ubuntu 16.10-1))
starting vacuum...end.
starting vacuum pgbench_accounts...end.
transaction type:
scaling factor: 50
query mode: simple
number of clients: 100
number of threads: 40
maximum number of tries: 1
number of transactions per client: 10000
number of transactions actually processed: 1000000/1000000
number of failed transactions: 0 (0.000%)
latency average = 0.539 ms
initial connection time = 59.157 ms
tps = 185652.686153 (without initial connection time)

ext4:

postgres@zerofs:/root$ pgbench -vvv -c 100 -j 40 -t 10000 bench -S
pgbench (16.9 (Ubuntu 16.10-1))
starting vacuum...end.
starting vacuum pgbench_accounts...end.
transaction type:
scaling factor: 50
query mode: simple
number of clients: 100
number of threads: 40
maximum number of tries: 1
number of transactions per client: 10000
number of transactions actually processed: 1000000/1000000
number of failed transactions: 0 (0.000%)
latency average = 0.547 ms
initial connection time = 44.054 ms
tps = 182836.180428 (without initial connection time)

Best,
Pierre

On Sat, Jul 26, 2025, at 03:16, Pierre Barre wrote:
> I built postgres (same version, 16.9) but --with-block-size=32 (I'd
> really love it if this were an initdb-time flag!)
> and did some more testing:
>
> synchronous_commit = off
>
> postgres@zerofs:~$ pgbench -vvv -c 100 -j 40 -t 10000 bench
> pgbench (16.9 (Ubuntu 16.10-1))
> starting vacuum...end.
> starting vacuum pgbench_accounts...end.
> transaction type:
> scaling factor: 50
> query mode: simple
> number of clients: 100
> number of threads: 40
> maximum number of tries: 1
> number of transactions per client: 10000
> number of transactions actually processed: 1000000/1000000
> number of failed transactions: 0 (0.000%)
> latency average = 5.727 ms
> initial connection time = 59.223 ms
> tps = 17460.128835 (without initial connection time)
>
> synchronous_commit = on
>
> postgres@zerofs:/root$ pgbench -vvv -c 100 -j 40 -t 1000 bench
> pgbench (16.9 (Ubuntu 16.10-1))
> starting vacuum...end.
> starting vacuum pgbench_accounts...end.
> transaction type:
> scaling factor: 50
> query mode: simple
> number of clients: 100
> number of threads: 40
> maximum number of tries: 1
> number of transactions per client: 1000
> number of transactions actually processed: 100000/100000
> number of failed transactions: 0 (0.000%)
> latency average = 301.800 ms
> initial connection time = 62.237 ms
> tps = 331.345391 (without initial connection time)
>
> =====================================
>
> Then, using the same setup (same server, same postgres build), I created
> a ZeroFS NBD device with ext4 on top:
>
> /dev/nbd0 on /mnt_9p type ext4 (rw,relatime,stripe=32)
>
> synchronous_commit = off
>
> postgres@zerofs:/mnt_9p$ pgbench -vvv -c 100 -j 40 -t 10000 bench
> pgbench (16.9 (Ubuntu 16.10-1))
> starting vacuum...end.
> starting vacuum pgbench_accounts...end.
> transaction type:
> scaling factor: 50
> query mode: simple
> number of clients: 100
> number of threads: 40
> maximum number of tries: 1
> number of transactions per client: 10000
> number of transactions actually processed: 1000000/1000000
> number of failed transactions: 0 (0.000%)
> latency average = 3.615 ms
> initial connection time = 45.653 ms
> tps = 27665.373366 (without initial connection time)
>
> synchronous_commit = on
>
> postgres@zerofs:/root$ pgbench -vvv -c 100 -j 40 -t 1000 bench
> pgbench (16.9 (Ubuntu 16.10-1))
> starting vacuum...end.
> starting vacuum pgbench_accounts...end.
> transaction type:
> scaling factor: 50
> query mode: simple
> number of clients: 100
> number of threads: 40
> maximum number of tries: 1
> number of transactions per client: 1000
> number of transactions actually processed: 100000/100000
> number of failed transactions: 0 (0.000%)
> latency average = 337.762 ms
> initial connection time = 43.969 ms
> tps = 296.066616 (without initial connection time)
>
> Best,
> Pierre
>
>
> On Fri, Jul 25, 2025, at 11:25, Pierre Barre wrote:
>> Hi,
>>
>> I went ahead and did that test.
>>
>> Here is the postgresql config I used for reference (note the wal
>> options (recycle, init_zero) as well as full_page_writes = off, because
>> ZeroFS cannot have torn writes by design).
>>
>> https://gist.github.com/Barre/8d68f0d00446389998a31f4e60f3276d
>>
>> Test was running on Azure with Standard D16ads v5 (16 vcpus, 64 GiB memory)
>>
>> This time, I didn't run ZFS with L2ARC, I just mounted ZeroFS with 9p.
>>
>> synchronous_commit = off
>>
>> postgres@zerofs:~$ pgbench -vvv -c 100 -j 40 -t 1000 bench
>> pgbench (16.9 (Ubuntu 16.9-0ubuntu0.24.04.1))
>> starting vacuum...end.
>> starting vacuum pgbench_accounts...end.
>> transaction type:
>> scaling factor: 50
>> query mode: simple
>> number of clients: 100
>> number of threads: 40
>> maximum number of tries: 1
>> number of transactions per client: 1000
>> number of transactions actually processed: 100000/100000
>> number of failed transactions: 0 (0.000%)
>> latency average = 6.239 ms
>> initial connection time = 68.922 ms
>> tps = 16026.940646 (without initial connection time)
>>
>>
>> synchronous_commit = on
>>
>> postgres@zerofs:~$ pgbench -vvv -c 50 -j 15 -t 1000 bench
>> pgbench (16.9 (Ubuntu 16.9-0ubuntu0.24.04.1))
>> starting vacuum...end.
>> starting vacuum pgbench_accounts...end.
>> transaction type:
>> scaling factor: 50
>> query mode: simple
>> number of clients: 50
>> number of threads: 15
>> maximum number of tries: 1
>> number of transactions per client: 1000
>> number of transactions actually processed: 50000/50000
>> number of failed transactions: 0 (0.000%)
>> latency average = 197.723 ms
>> initial connection time = 46.089 ms
>> tps = 252.878721 (without initial connection time)
>>
>>
>> Not great barebones with synchronous_commit, but still usable!
>>
>> Best,
>> Pierre
>>
>> On Fri, Jul 25, 2025, at 00:44, Pierre Barre wrote:
>>>> This then begs the obvious question of how fast is this with
>>>> synchronous_commit = on?
>>>
>>> Probably not awful, especially with commit_delay.
>>>
>>> I'll try that and report back.
>>>
>>> Best,
>>> Pierre
>>>
>>> On Fri, Jul 25, 2025, at 00:03, Jeff Ross wrote:
>>>> On 7/24/25 13:50, Pierre Barre wrote:
>>>>
>>>>> It’s not “safe” or “unsafe”; there are mountains of valid workloads
>>>>> which don’t require synchronous_commit. Synchronous_commit doesn’t
>>>>> make your system automatically safe either, and if that’s a
>>>>> requirement, there are many workarounds, as you suggested. It
>>>>> certainly doesn’t make the setup useless.
>>>>>
>>>>> Best,
>>>>> Pierre
>>>>>
>>>>> On Thu, Jul 24, 2025, at 21:44, Nico Williams wrote:
>>>>>> On Fri, Jul 18, 2025 at 12:57:39PM +0200, Pierre Barre wrote:
>>>>>>> - Postgres configured accordingly memory-wise as well as with
>>>>>>>   synchronous_commit = off, wal_init_zero = off and wal_recycle = off.
>>>>>> Bingo.  That's why it's fast (synchronous_commit = off).  It's also why
>>>>>> it's not safe _unless_ you have a local, fast, persistent ZIL device
>>>>>> (which I assume you don't).
>>>>>>
>>>>>> Nico
>>>>>> --
>>>> This then begs the obvious question of how fast is this with
>>>> synchronous_commit = on?
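
A rough sanity check on the pgbench figures quoted in this thread: with a fixed number of clients that each run one transaction at a time, throughput should be close to clients / average latency. The sketch below is only an illustration, not part of the setup described above. The names RUNS, estimated_tps and relative_error are made up for this example, the run labels are my own shorthand, and the numbers are copied from the pgbench output in the thread.

```python
# Consistency check: tps should be roughly clients / average latency.
# All figures are copied from the pgbench output quoted in this thread;
# the labels are shorthand added for this example (an assumption).

# (label, clients, latency average in ms, tps reported by pgbench)
RUNS = [
    ("9P, select-only",              100,   0.539, 185652.686153),
    ("ext4, select-only",            100,   0.547, 182836.180428),
    ("32k build, sync_commit off",   100,   5.727,  17460.128835),
    ("32k build, sync_commit on",    100, 301.800,    331.345391),
    ("NBD + ext4, sync_commit off",  100,   3.615,  27665.373366),
    ("NBD + ext4, sync_commit on",   100, 337.762,    296.066616),
    ("9p mount, sync_commit off",    100,   6.239,  16026.940646),
    ("9p mount, sync_commit on",      50, 197.723,    252.878721),
]


def estimated_tps(clients: int, latency_ms: float) -> float:
    """Throughput implied by the client count and the average latency."""
    return clients / (latency_ms / 1000.0)


def relative_error(estimate: float, reported: float) -> float:
    """Absolute difference between estimate and reported, as a fraction."""
    return abs(estimate - reported) / reported


if __name__ == "__main__":
    for label, clients, latency_ms, reported in RUNS:
        est = estimated_tps(clients, latency_ms)
        err = relative_error(est, reported)
        print(f"{label:30s} est={est:12.1f} reported={reported:12.1f} "
              f"err={err:.4%}")
```

The small differences come from rounding in the printed latency and from how connection time is accounted for; the point is only that the reported latency and tps numbers agree with each other.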