From: Andrew Pogrebnoi <[email protected]>
To: Andres Freund <[email protected]>
Cc: [email protected]
Cc: Heikki Linnakangas <[email protected]>
Cc: Robert Haas <[email protected]>
Cc: Thomas Munro <[email protected]>
Cc: Matthias van de Meent <[email protected]>
Subject: Re: Lowering the default wal_blocksize to 4K
Date: Tue, 17 Feb 2026 22:06:14 +0200
Message-ID: <CA+aWR11Gvz4RvPaBx1kfpTx2QxpNW46Yv120dDcUA93UbYcxrg@mail.gmail.com>
In-Reply-To: <x56pxq6jpuftn6ear3uwbz5tyb4r37itv5zetvcucq3td57apc@yq7jyka7tszv>
References: <[email protected]>
	<[email protected]>
	<x56pxq6jpuftn6ear3uwbz5tyb4r37itv5zetvcucq3td57apc@yq7jyka7tszv>

Hi,

On Mon, Feb 16, 2026 at 11:13 PM Andres Freund <[email protected]> wrote:
> Hi,
>
> On 2026-02-16 10:04:37 +0200, Andy Pogrebnoi wrote:
> > Since we recycle WAL segments, the added size won't go to the disk usage but
> > rather cause a bit more frequent segment.
>
> I don't think that's a valid argument though, how much WAL needs to be
> archived is a relevant factor.

Yes, indeed.


> If NBuffers / 32 < wal_segment_size / XLOG_BLCKSZ, the chosen xbuffers value
> does not depend on XLOG_BLCKSZ.
>
> To me the code only makes sense if you assume that NBuffers / 32 gives you a
> value in the same domain as data blocks, otherwise NBuffers / 32 is not the
> approximation of 3% that the comment talks about.
>
>
> I think the code just needs to be fixed to multiply NBuffers * BLCKSZ and then
> divide that by XLOG_BLCKSZ.

You are right, my bad, fixed (v2-0002).
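
For illustration, the corrected calculation can be sketched as a standalone
function. This is only a sketch, not the patch itself: the function and
parameter names are made up for the example, standing in for NBuffers,
BLCKSZ, XLOG_BLCKSZ and wal_segment_size. It also widens the intermediate
product to 64 bits, on the assumption that NBuffers * BLCKSZ computed in
plain int can overflow once shared_buffers reaches 2GB with 8kB data blocks.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Illustrative stand-in for XLOGChooseNumBuffers(). All names here are
 * hypothetical; the real function reads globals and compile-time constants.
 */
int
choose_num_xlog_buffers(int nbuffers, int blcksz, int xlog_blcksz,
                        int wal_segment_size)
{
    int64_t xbuffers;

    assert(blcksz > 0 && xlog_blcksz > 0);

    /*
     * Convert shared_buffers from data blocks to WAL blocks, then take
     * 1/32 (about 3%). The 64-bit cast is an assumption of this sketch,
     * added to keep the product from overflowing int.
     */
    xbuffers = ((int64_t) nbuffers * blcksz) / xlog_blcksz / 32;

    /* Never use more than one WAL segment's worth of buffers. */
    if (xbuffers > wal_segment_size / xlog_blcksz)
        xbuffers = wal_segment_size / xlog_blcksz;

    /* Keep a floor of 8 WAL blocks. */
    if (xbuffers < 8)
        xbuffers = 8;

    return (int) xbuffers;
}
```

With 128MB of shared_buffers (16384 data blocks of 8kB) and 16MB segments,
this gives 512 WAL buffers at 8kB and 1024 at 4kB, i.e. 4MB of WAL buffers
in both cases.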


> I think the auto-tuning bit above needs to be fixed, and it's probably worth
> manually testing a pg_upgrade from 8kB XLOG_BLCKSZ to 4kB. It should work, but

pg_upgrade ran with no issues, and the database started with the new (4kB)
XLOG_BLCKSZ.


I also found and fixed some more mentions of 8kB as the default for
XLOG_BLCKSZ in the documentation (v2-0001).
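
As a rough check of the header overhead figures quoted in the v2-0001 commit
message (~0.29% with 8kB pages, 0.59% with 4kB), here is a small sketch of
the arithmetic. It assumes a 24-byte short WAL page header per page and
ignores the larger header on the first page of each segment; the macro and
function names are made up for the example.

```c
#include <assert.h>

/* Assumption of this sketch: each WAL page carries a 24-byte short header. */
#define SHORT_PAGE_HEADER_BYTES 24

/* Percentage of each WAL page taken up by its page header. */
double
wal_page_header_overhead_pct(int xlog_blcksz)
{
    assert(xlog_blcksz > SHORT_PAGE_HEADER_BYTES);
    return 100.0 * SHORT_PAGE_HEADER_BYTES / xlog_blcksz;
}
```

Under that assumption, 24 / 8192 is roughly 0.29% and 24 / 4096 is roughly
0.59%, so halving the page size doubles the header overhead but keeps it
well below 1%.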

---
Cheers,
Andy


Attachments:

  [application/octet-stream] v2-0001-Change-default-wal_blocksize-to-4KB.patch (6.1K, 3-v2-0001-Change-default-wal_blocksize-to-4KB.patch)
From 63279ec5776d292ce3bde860ec987726a775dca9 Mon Sep 17 00:00:00 2001
From: Andrew Pogrebnoi <[email protected]>
Date: Fri, 13 Feb 2026 16:14:45 +0200
Subject: [PATCH v2 1/2] Change default wal_blocksize to 4KB

With the current 8KB default, we do more flushes of partially written
buffers, which significantly increases the number of disk writes
compared to 4KB XLOG_BLCKSZ. 4KB pages result in more overhead for
headers, but it is small in comparison with the actual WAL data (~0.29%
of space is headers with 8KB pages, and 0.59% with 4KB).

See more on justification and benchmarks:
https://www.postgresql.org/message-id/20231009230805.funj5ipoggjyzjz6%40awork3.anarazel.de
---
 configure                      | 4 ++--
 configure.ac                   | 4 ++--
 doc/src/sgml/config.sgml       | 6 +++---
 doc/src/sgml/installation.sgml | 4 ++--
 doc/src/sgml/wal.sgml          | 2 +-
 meson_options.txt              | 2 +-
 6 files changed, 11 insertions(+), 11 deletions(-)

diff --git a/configure b/configure
index ba293931878..06b50b61d25 100755
--- a/configure
+++ b/configure
@@ -1574,7 +1574,7 @@ Optional Packages:
   --with-segsize-blocks=SEGSIZE_BLOCKS
                           set table segment size in blocks [0]
   --with-wal-blocksize=BLOCKSIZE
-                          set WAL block size in kB [8]
+                          set WAL block size in kB [4]
   --with-llvm             build with LLVM based JIT support
   --without-icu           build without ICU support
   --with-tcl              build Tcl modules (PL/Tcl)
@@ -3840,7 +3840,7 @@ if test "${with_wal_blocksize+set}" = set; then :
   esac
 
 else
-  wal_blocksize=8
+  wal_blocksize=4
 fi
 
 
diff --git a/configure.ac b/configure.ac
index 412fe358a2f..59678a7dd99 100644
--- a/configure.ac
+++ b/configure.ac
@@ -324,9 +324,9 @@ AC_DEFINE_UNQUOTED([RELSEG_SIZE], ${RELSEG_SIZE}, [
 # WAL block size
 #
 AC_MSG_CHECKING([for WAL block size])
-PGAC_ARG_REQ(with, wal-blocksize, [BLOCKSIZE], [set WAL block size in kB [8]],
+PGAC_ARG_REQ(with, wal-blocksize, [BLOCKSIZE], [set WAL block size in kB [4]],
              [wal_blocksize=$withval],
-             [wal_blocksize=8])
+             [wal_blocksize=4])
 case ${wal_blocksize} in
   1) XLOG_BLCKSZ=1024;;
   2) XLOG_BLCKSZ=2048;;
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index f1af1505cf3..f06de24bd04 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -3536,7 +3536,7 @@ include_dir 'conf.d'
         but any positive value less than <literal>32kB</literal> will be
         treated as <literal>32kB</literal>.
         If this value is specified without units, it is taken as WAL blocks,
-        that is <symbol>XLOG_BLCKSZ</symbol> bytes, typically 8kB.
+        that is <symbol>XLOG_BLCKSZ</symbol> bytes, typically 4kB.
         This parameter can only be set at server start.
        </para>
 
@@ -3596,7 +3596,7 @@ include_dir 'conf.d'
         flushed to disk.  If <varname>wal_writer_flush_after</varname> is set
         to <literal>0</literal> then WAL data is always flushed immediately.
         If this value is specified without units, it is taken as WAL blocks,
-        that is <symbol>XLOG_BLCKSZ</symbol> bytes, typically 8kB.
+        that is <symbol>XLOG_BLCKSZ</symbol> bytes, typically 4kB.
         The default is <literal>1MB</literal>.
         This parameter can only be set in the
         <filename>postgresql.conf</filename> file or on the server command line.
@@ -12146,7 +12146,7 @@ dynamic_library_path = '/usr/local/lib/postgresql:$libdir'
        <para>
         Reports the size of a WAL disk block.  It is determined by the value
         of <literal>XLOG_BLCKSZ</literal> when building the server. The default value
-        is 8192 bytes.
+        is 4096 bytes.
        </para>
       </listitem>
      </varlistentry>
diff --git a/doc/src/sgml/installation.sgml b/doc/src/sgml/installation.sgml
index c903ccff988..6f3409406db 100644
--- a/doc/src/sgml/installation.sgml
+++ b/doc/src/sgml/installation.sgml
@@ -1520,7 +1520,7 @@ build-postgresql:
        <listitem>
         <para>
          Set the <firstterm>WAL block size</firstterm>, in kilobytes.  This is the unit
-         of storage and I/O within the WAL log.  The default, 8 kilobytes,
+         of storage and I/O within the WAL log.  The default, 4 kilobytes,
          is suitable for most situations; but other values may be useful
          in special cases.
          The value must be a power of 2 between 1 and 64 (kilobytes).
@@ -3047,7 +3047,7 @@ ninja install
       <listitem>
        <para>
         Set the <firstterm>WAL block size</firstterm>, in kilobytes.  This is the unit
-        of storage and I/O within the WAL log.  The default, 8 kilobytes,
+        of storage and I/O within the WAL log.  The default, 4 kilobytes,
         is suitable for most situations; but other values may be useful
         in special cases.
         The value must be a power of 2 between 1 and 64 (kilobytes).
diff --git a/doc/src/sgml/wal.sgml b/doc/src/sgml/wal.sgml
index f3b86b26be9..731b8b6c6f0 100644
--- a/doc/src/sgml/wal.sgml
+++ b/doc/src/sgml/wal.sgml
@@ -881,7 +881,7 @@
    <filename>pg_wal</filename> under the data directory, as a set of
    segment files, normally each 16 MB in size (but the size can be changed
    by altering the <option>--wal-segsize</option> <application>initdb</application> option).  Each segment is
-   divided into pages, normally 8 kB each (this size can be changed via the
+   divided into pages, normally 4 kB each (this size can be changed via the
    <option>--with-wal-blocksize</option> configure option).  The WAL record headers
    are described in <filename>access/xlogrecord.h</filename>; the record
    content is dependent on the type of event that is being logged.  Segment
diff --git a/meson_options.txt b/meson_options.txt
index 6a793f3e479..22d936973d8 100644
--- a/meson_options.txt
+++ b/meson_options.txt
@@ -9,7 +9,7 @@ option('blocksize', type: 'combo',
 
 option('wal_blocksize', type: 'combo',
   choices: ['1', '2', '4', '8', '16', '32', '64'],
-  value: '8',
+  value: '4',
   description: 'WAL block size, in kilobytes')
 
 option('segsize', type: 'integer', value: 1,
-- 
2.43.0



  [application/octet-stream] v2-0002-Fix-XLOG-buffers-auto-tune-for-different-XLOG_BLC.patch (942B, 4-v2-0002-Fix-XLOG-buffers-auto-tune-for-different-XLOG_BLC.patch)
From 72a70460d6966f9895ea874514c77a9405859057 Mon Sep 17 00:00:00 2001
From: Andrew Pogrebnoi <[email protected]>
Date: Tue, 17 Feb 2026 21:12:21 +0200
Subject: [PATCH v2 2/2] Fix XLOG buffers auto-tune for different XLOG_BLCKSZ

Before this change, XLOG buffers calculation in relation to
shared_buffers made sense only if XLOG_BLCKSZ was the same as BLCKSZ.
---
 src/backend/access/transam/xlog.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 13ec6225b85..d37e85a4077 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -4694,7 +4694,7 @@ XLOGChooseNumBuffers(void)
 {
 	int			xbuffers;
 
-	xbuffers = NBuffers / 32;
+	xbuffers = ((NBuffers * BLCKSZ) / XLOG_BLCKSZ) / 32;
 	if (xbuffers > (wal_segment_size / XLOG_BLCKSZ))
 		xbuffers = (wal_segment_size / XLOG_BLCKSZ);
 	if (xbuffers < 8)
-- 
2.43.0


