From: Andrew Pogrebnoi <[email protected]>
To: Andres Freund <[email protected]>
Cc: [email protected]
Cc: Heikki Linnakangas <[email protected]>
Cc: Robert Haas <[email protected]>
Cc: Thomas Munro <[email protected]>
Cc: Matthias van de Meent <[email protected]>
Subject: Re: Lowering the default wal_blocksize to 4K
Date: Tue, 17 Feb 2026 22:06:14 +0200
Message-ID: <CA+aWR11Gvz4RvPaBx1kfpTx2QxpNW46Yv120dDcUA93UbYcxrg@mail.gmail.com>
In-Reply-To: <x56pxq6jpuftn6ear3uwbz5tyb4r37itv5zetvcucq3td57apc@yq7jyka7tszv>
References: <[email protected]>
<[email protected]>
<x56pxq6jpuftn6ear3uwbz5tyb4r37itv5zetvcucq3td57apc@yq7jyka7tszv>
Hi,
On Mon, Feb 16, 2026 at 11:13 PM Andres Freund <[email protected]> wrote:
> Hi,
>
> On 2026-02-16 10:04:37 +0200, Andy Pogrebnoi wrote:
> > Since we recycle WAL segments, the added size won't go to the disk usage but
> > rather cause a bit more frequent segment.
>
> I don't think that's a valid argument though, how much WAL needs to be
> archived is a relevant factor.
Yes, indeed.
> If NBuffers / 32 < wal_segment_size / XLOG_BLCKSZ, the chosen xbuffers value
> does not depend on XLOG_BLCKSZ.
>
> To me the code only makes sense if you assume that NBuffers / 32 gives you a
> value in the same domain as data blocks, otherwise NBuffers / 32 is not the
> approximation of 3% that the comment talks about.
>
>
> I think the code just needs to be fixed to multiply NBuffers * BLCKSZ and then
> divide that by XLOG_BLCKSZ.
You are right, my bad, fixed (v2-0002).
> I think the auto-tuning bit above needs to be fixed, and it's probably worth
> manually testing a pg_upgrade from 8kB XLOG_BLCKSZ to 4kB. It should work, but
pg_upgrade ran with no issues, and the database started with the new (4kB)
XLOG_BLCKSZ.
I also found and fixed some more mentions of 8kB as the default for
XLOG_BLCKSZ in the documentation (v2-0001).
---
Cheers,
Andy
Attachments:
[application/octet-stream] v2-0001-Change-default-wal_blocksize-to-4KB.patch (6.1K, 3-v2-0001-Change-default-wal_blocksize-to-4KB.patch)
From 63279ec5776d292ce3bde860ec987726a775dca9 Mon Sep 17 00:00:00 2001
From: Andrew Pogrebnoi <[email protected]>
Date: Fri, 13 Feb 2026 16:14:45 +0200
Subject: [PATCH v2 1/2] Change default wal_blocksize to 4KB
With the current 8KB default, we do more flushes of partially written
buffers, which significantly increases the number of disk writes
compared to 4KB XLOG_BLCKSZ. 4KB pages result in more overhead for
headers, but it is small in comparison with the actual WAL data (~0.29%
of space is headers with 8KB pages, and 0.59% with 4KB).
See more on justification and benchmarks:
https://www.postgresql.org/message-id/20231009230805.funj5ipoggjyzjz6%40awork3.anarazel.de
---
configure | 4 ++--
configure.ac | 4 ++--
doc/src/sgml/config.sgml | 6 +++---
doc/src/sgml/installation.sgml | 4 ++--
doc/src/sgml/wal.sgml | 2 +-
meson_options.txt | 2 +-
6 files changed, 11 insertions(+), 11 deletions(-)
diff --git a/configure b/configure
index ba293931878..06b50b61d25 100755
--- a/configure
+++ b/configure
@@ -1574,7 +1574,7 @@ Optional Packages:
--with-segsize-blocks=SEGSIZE_BLOCKS
set table segment size in blocks [0]
--with-wal-blocksize=BLOCKSIZE
- set WAL block size in kB [8]
+ set WAL block size in kB [4]
--with-llvm build with LLVM based JIT support
--without-icu build without ICU support
--with-tcl build Tcl modules (PL/Tcl)
@@ -3840,7 +3840,7 @@ if test "${with_wal_blocksize+set}" = set; then :
esac
else
- wal_blocksize=8
+ wal_blocksize=4
fi
diff --git a/configure.ac b/configure.ac
index 412fe358a2f..59678a7dd99 100644
--- a/configure.ac
+++ b/configure.ac
@@ -324,9 +324,9 @@ AC_DEFINE_UNQUOTED([RELSEG_SIZE], ${RELSEG_SIZE}, [
# WAL block size
#
AC_MSG_CHECKING([for WAL block size])
-PGAC_ARG_REQ(with, wal-blocksize, [BLOCKSIZE], [set WAL block size in kB [8]],
+PGAC_ARG_REQ(with, wal-blocksize, [BLOCKSIZE], [set WAL block size in kB [4]],
[wal_blocksize=$withval],
- [wal_blocksize=8])
+ [wal_blocksize=4])
case ${wal_blocksize} in
1) XLOG_BLCKSZ=1024;;
2) XLOG_BLCKSZ=2048;;
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index f1af1505cf3..f06de24bd04 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -3536,7 +3536,7 @@ include_dir 'conf.d'
but any positive value less than <literal>32kB</literal> will be
treated as <literal>32kB</literal>.
If this value is specified without units, it is taken as WAL blocks,
- that is <symbol>XLOG_BLCKSZ</symbol> bytes, typically 8kB.
+ that is <symbol>XLOG_BLCKSZ</symbol> bytes, typically 4kB.
This parameter can only be set at server start.
</para>
@@ -3596,7 +3596,7 @@ include_dir 'conf.d'
flushed to disk. If <varname>wal_writer_flush_after</varname> is set
to <literal>0</literal> then WAL data is always flushed immediately.
If this value is specified without units, it is taken as WAL blocks,
- that is <symbol>XLOG_BLCKSZ</symbol> bytes, typically 8kB.
+ that is <symbol>XLOG_BLCKSZ</symbol> bytes, typically 4kB.
The default is <literal>1MB</literal>.
This parameter can only be set in the
<filename>postgresql.conf</filename> file or on the server command line.
@@ -12146,7 +12146,7 @@ dynamic_library_path = '/usr/local/lib/postgresql:$libdir'
<para>
Reports the size of a WAL disk block. It is determined by the value
of <literal>XLOG_BLCKSZ</literal> when building the server. The default value
- is 8192 bytes.
+ is 4096 bytes.
</para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/installation.sgml b/doc/src/sgml/installation.sgml
index c903ccff988..6f3409406db 100644
--- a/doc/src/sgml/installation.sgml
+++ b/doc/src/sgml/installation.sgml
@@ -1520,7 +1520,7 @@ build-postgresql:
<listitem>
<para>
Set the <firstterm>WAL block size</firstterm>, in kilobytes. This is the unit
- of storage and I/O within the WAL log. The default, 8 kilobytes,
+ of storage and I/O within the WAL log. The default, 4 kilobytes,
is suitable for most situations; but other values may be useful
in special cases.
The value must be a power of 2 between 1 and 64 (kilobytes).
@@ -3047,7 +3047,7 @@ ninja install
<listitem>
<para>
Set the <firstterm>WAL block size</firstterm>, in kilobytes. This is the unit
- of storage and I/O within the WAL log. The default, 8 kilobytes,
+ of storage and I/O within the WAL log. The default, 4 kilobytes,
is suitable for most situations; but other values may be useful
in special cases.
The value must be a power of 2 between 1 and 64 (kilobytes).
diff --git a/doc/src/sgml/wal.sgml b/doc/src/sgml/wal.sgml
index f3b86b26be9..731b8b6c6f0 100644
--- a/doc/src/sgml/wal.sgml
+++ b/doc/src/sgml/wal.sgml
@@ -881,7 +881,7 @@
<filename>pg_wal</filename> under the data directory, as a set of
segment files, normally each 16 MB in size (but the size can be changed
by altering the <option>--wal-segsize</option> <application>initdb</application> option). Each segment is
- divided into pages, normally 8 kB each (this size can be changed via the
+ divided into pages, normally 4 kB each (this size can be changed via the
<option>--with-wal-blocksize</option> configure option). The WAL record headers
are described in <filename>access/xlogrecord.h</filename>; the record
content is dependent on the type of event that is being logged. Segment
diff --git a/meson_options.txt b/meson_options.txt
index 6a793f3e479..22d936973d8 100644
--- a/meson_options.txt
+++ b/meson_options.txt
@@ -9,7 +9,7 @@ option('blocksize', type: 'combo',
option('wal_blocksize', type: 'combo',
choices: ['1', '2', '4', '8', '16', '32', '64'],
- value: '8',
+ value: '4',
description: 'WAL block size, in kilobytes')
option('segsize', type: 'integer', value: 1,
--
2.43.0
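The header-overhead figures quoted in the commit message above (~0.29% with 8KB pages, 0.59% with 4KB) can be reproduced with a short calculation. This is only a sketch: it assumes a 24-byte short page header (what SizeOfXLogShortPHD should come to on a typical 64-bit build) and ignores the longer header on the first page of each segment. The names ASSUMED_SHORT_PAGE_HEADER and header_overhead_pct are made up for illustration and do not exist in PostgreSQL.

```c
#include <assert.h>

/*
 * ASSUMPTION: size in bytes of the short WAL page header.  24 is what
 * MAXALIGN(sizeof(XLogPageHeaderData)) should give on a 64-bit build;
 * this constant is illustrative, not taken from the patch.
 */
#define ASSUMED_SHORT_PAGE_HEADER 24

/*
 * Hypothetical helper: percentage of each WAL page occupied by the short
 * page header, for a given XLOG_BLCKSZ in bytes.
 */
static double
header_overhead_pct(int xlog_blcksz)
{
	assert(xlog_blcksz > 0);
	return 100.0 * ASSUMED_SHORT_PAGE_HEADER / xlog_blcksz;
}
```

Under that assumption, header_overhead_pct(8192) is roughly 0.29 and header_overhead_pct(4096) roughly 0.59, which agrees with the numbers in the commit message: halving the page size doubles the relative header cost, but it stays well under 1%.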
[application/octet-stream] v2-0002-Fix-XLOG-buffers-auto-tune-for-different-XLOG_BLC.patch (942B, 4-v2-0002-Fix-XLOG-buffers-auto-tune-for-different-XLOG_BLC.patch)
From 72a70460d6966f9895ea874514c77a9405859057 Mon Sep 17 00:00:00 2001
From: Andrew Pogrebnoi <[email protected]>
Date: Tue, 17 Feb 2026 21:12:21 +0200
Subject: [PATCH v2 2/2] Fix XLOG buffers auto-tune for different XLOG_BLCKSZ
Before this change, the XLOG buffers calculation in relation to
shared_buffers made sense only if XLOG_BLCKSZ was the same as BLCKSZ.
---
src/backend/access/transam/xlog.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 13ec6225b85..d37e85a4077 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -4694,7 +4694,7 @@ XLOGChooseNumBuffers(void)
{
int xbuffers;
- xbuffers = NBuffers / 32;
+ xbuffers = ((NBuffers * BLCKSZ) / XLOG_BLCKSZ) / 32;
if (xbuffers > (wal_segment_size / XLOG_BLCKSZ))
xbuffers = (wal_segment_size / XLOG_BLCKSZ);
if (xbuffers < 8)
--
2.43.0
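To see what the v2-0002 change does to the auto-tuned value, here is a stand-alone sketch of the same calculation. It is not the PostgreSQL function: choose_num_xlog_buffers and its parameters are hypothetical stand-ins for XLOGChooseNumBuffers, NBuffers, BLCKSZ, XLOG_BLCKSZ and wal_segment_size. One deliberate difference from the patch is that the product is computed in 64 bits. If NBuffers and BLCKSZ are both plain int, as I believe they are, NBuffers * BLCKSZ would overflow once shared_buffers reaches 2GB with 8kB data blocks, so the patch line may need a similar widening.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical stand-alone version of the auto-tuning in v2-0002.
 *
 * nbuffers          shared_buffers, in data blocks of blcksz bytes
 * blcksz            data block size in bytes
 * xlog_blcksz       WAL block size in bytes
 * wal_segment_size  WAL segment size in bytes
 *
 * Returns the number of WAL buffers (in WAL blocks): about 1/32 of
 * shared_buffers by size, capped at one WAL segment, with a floor of 8.
 */
static int
choose_num_xlog_buffers(int nbuffers, int blcksz, int xlog_blcksz,
						int wal_segment_size)
{
	int64_t		xbuffers;

	assert(nbuffers > 0 && blcksz > 0 && xlog_blcksz > 0);
	assert(wal_segment_size >= xlog_blcksz);

	/* Widen before multiplying so large shared_buffers cannot overflow. */
	xbuffers = ((int64_t) nbuffers * blcksz / xlog_blcksz) / 32;

	if (xbuffers > wal_segment_size / xlog_blcksz)
		xbuffers = wal_segment_size / xlog_blcksz;
	if (xbuffers < 8)
		xbuffers = 8;

	return (int) xbuffers;
}
```

For example, with 128MB of shared_buffers (nbuffers = 16384, blcksz = 8192) and 16MB segments, the sketch yields 512 buffers at an 8kB WAL block size and 1024 buffers at 4kB, which is 4MB of WAL buffers in both cases. That is the property the fix is after: the auto-tuned size in bytes no longer depends on XLOG_BLCKSZ.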