Date: Mon, 16 Feb 2026 16:13:37 -0500
From: Andres Freund
To: Andy Pogrebnoi
Cc:
 pgsql-hackers@postgresql.org, Heikki Linnakangas, Robert Haas,
 Thomas Munro, Matthias van de Meent
Subject: Re: Lowering the default wal_blocksize to 4K
References: <20231009230805.funj5ipoggjyzjz6@awork3.anarazel.de>

Hi,

On 2026-02-16 10:04:37 +0200, Andy Pogrebnoi wrote:
> > On Oct 10, 2023, at 02:08, Andres Freund wrote:
> >
> > Hi,
> >
> > I've mentioned this to a few people before, but forgot to start an actual
> > thread. So here we go:
> >
> > I think we should lower the default wal_blocksize / XLOG_BLCKSZ to 4096, from
> > the current 8192.
>
> I prepared a patch in case we want to move with the default 4kb XLOG_BLCKSZ.

I think we should.

> Regarding reducing the page headers' size, the benefits of 4Kb wal_blocks
> outweigh disadvantages of the proportionally bigger header in my opinion.

I agree.

> Since we recycle WAL segments, the added size won't go to the disk usage but
> rather cause a bit more frequent segment.

I don't think that's a valid argument though: how much WAL needs to be
archived is a relevant factor.

> > One thing I noticed is that our auto-configuration of wal_buffers leads to
> > different wal_buffers settings for different XLOG_BLCKSZ, which doesn't seem
> > great.
>
> I don't think it's an issue as wal_buffers are in block units, not bytes. Even
> though the auto-tuned number may change, the total amount of bytes still remains
> the same with different XLOG_BLCKSZ.

Given the way the auto-tuning works, I don't think that's true:

/*
 * Auto-tune the number of XLOG buffers.
 *
 * The preferred setting for wal_buffers is about 3% of shared_buffers, with
 * a maximum of one XLOG segment (there is little reason to think that more
 * is helpful, at least so long as we force an fsync when switching log files)
 * and a minimum of 8 blocks (which was the default value prior to PostgreSQL
 * 9.1, when auto-tuning was added).
 *
 * This should not be called until NBuffers has received its final value.
 */
static int
XLOGChooseNumBuffers(void)
{
    int         xbuffers;

    xbuffers = NBuffers / 32;
    if (xbuffers > (wal_segment_size / XLOG_BLCKSZ))
        xbuffers = (wal_segment_size / XLOG_BLCKSZ);
    if (xbuffers < 8)
        xbuffers = 8;
    return xbuffers;
}

If NBuffers / 32 < wal_segment_size / XLOG_BLCKSZ, the chosen xbuffers value
does not depend on XLOG_BLCKSZ.

To me the code only makes sense if you assume that NBuffers / 32 gives you a
value in the same domain as data blocks, otherwise NBuffers / 32 is not the
approximation of 3% that the comment talks about.

I think the code just needs to be fixed to multiply NBuffers * BLCKSZ and then
divide that by XLOG_BLCKSZ.

> >
> > For some example numbers, I ran a very simple insert workload with a varying
> > number of clients with both a wal_blocksize=4096 and wal_blocksize=8192
> > cluster, and measured the amount of bytes written before/after.
>
> I've also run some simple tests on my local machine (Ubuntu in Vagrant on M1
> Mac). I run a sysbench write-only load for 20s with different amounts of threads
> (and tables equal to the number of threads num) and measured disk writes with
> iostat. I recreated tables and did a checkpoint before each run. These are my
> results:
>
> 8Kb XLOG_BLCKSZ
> ====
> Threads   tps        kB_wrtn
> 1         535.34     207288
> 5         1457.24    591708
> 10        1441.85    574700
> 15        823.98     388732
>
> 4Kb XLOG_BLCKSZ
> ====
> Threads   tps        kB_wrtn
> 1         542.02     153544
> 5         1556.83    393444
> 10        1288.00    339648
> 15        975.32     255708

The reduction in bytes written is rather impressive...

> I will run more benchmarks on proper hardware.
> For example, interesting what
> happens to performance with >4K writes. But what else do you think has to be
> done to move this patch forward?

I think the auto-tuning bit above needs to be fixed, and it's probably worth
manually testing a pg_upgrade from 8kB XLOG_BLCKSZ to 4kB. It should work, but
...

I think we otherwise should just go for it.

Greetings,

Andres Freund
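
For illustration, a minimal standalone sketch of the auto-tuning fix discussed
above, not the actual patch. It assumes the intent is to take about 3% of
shared_buffers in bytes and convert that to WAL blocks. The function names
choose_wal_buffers_current and choose_wal_buffers_by_bytes are hypothetical,
and the values PostgreSQL keeps in NBuffers, BLCKSZ, XLOG_BLCKSZ and
wal_segment_size are passed as parameters so the snippet compiles on its own.

```c
#include <stdint.h>

/* Mirrors the existing logic: NBuffers / 32 is used directly as a WAL block count. */
static int
choose_wal_buffers_current(int nbuffers, int xlog_blcksz, int wal_segment_size)
{
    int         xbuffers = nbuffers / 32;

    if (xbuffers > (wal_segment_size / xlog_blcksz))
        xbuffers = (wal_segment_size / xlog_blcksz);
    if (xbuffers < 8)
        xbuffers = 8;
    return xbuffers;
}

/*
 * Hypothetical variant: compute the target in bytes first, then convert to
 * WAL blocks. The order of operations is an assumption; 64-bit arithmetic
 * avoids overflow in nbuffers * blcksz.
 */
static int
choose_wal_buffers_by_bytes(int nbuffers, int blcksz, int xlog_blcksz,
                            int wal_segment_size)
{
    int64_t     target_bytes = ((int64_t) nbuffers * blcksz) / 32;
    int64_t     xbuffers = target_bytes / xlog_blcksz;
    int64_t     max_buffers = wal_segment_size / xlog_blcksz;

    if (xbuffers > max_buffers)
        xbuffers = max_buffers;
    if (xbuffers < 8)
        xbuffers = 8;
    return (int) xbuffers;
}
```

With nbuffers = 16384 (128MB of 8kB data blocks) and a 16MB segment, the
current logic picks 512 WAL buffers for both block sizes, i.e. 4MB at
XLOG_BLCKSZ = 8192 but only 2MB at 4096. The byte-based variant picks 512 and
1024 respectively, 4MB in both cases.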