Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1vY0vN-00Bnqh-2m for pgsql-hackers@arkaria.postgresql.org; Tue, 23 Dec 2025 11:51:34 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.96) (envelope-from ) id 1vY0vM-001I8v-2n for pgsql-hackers@arkaria.postgresql.org; Tue, 23 Dec 2025 11:51:33 +0000 Received: from magus.postgresql.org ([2a02:c0:301:0:ffff::29]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1vY0vM-001I8m-1h for pgsql-hackers@lists.postgresql.org; Tue, 23 Dec 2025 11:51:33 +0000 Received: from mail-pg1-x541.google.com ([2607:f8b0:4864:20::541]) by magus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.96) (envelope-from ) id 1vY0vK-002KJe-2p for pgsql-hackers@postgresql.org; Tue, 23 Dec 2025 11:51:33 +0000 Received: by mail-pg1-x541.google.com with SMTP id 41be03b00d2f7-bc274b8b15bso5133470a12.1 for ; Tue, 23 Dec 2025 03:51:30 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1766490688; x=1767095488; darn=postgresql.org; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:from:to:cc:subject:date:message-id:reply-to; bh=35J//J3STXGxn5GIiEFdh1iJxLjxlhFklWnXb6BU4V8=; b=I6JkHiZGduaE0QpZTmzfEdq0vgK0hZg3YT1sOKmxerTO7GqEIAgsHamCGjEPulWB5M kMnUPUqEUGZr+mVNOliQdoAE53ikIrVKx60psEMLZ4GESQxWMD+UoX4pEssHtkyBRbwc DZCh54EHoPUMFLDrzeJs8quaLt98ztwGIJiCcv29TX0lwEhBopdPKOUcvWIgsQsq9Axq FuNmlXs0niGsz9M4ZZM6poQW9pHBUSCoK4tEOcAbZbn38l44ZVhLcKvgJ25btqs3HKup 9MyPhjJJ3EZcPd1r4ZCKNp+vYHsFO+v1vThklPDx7jxJfncb4chLKv4X+N3D2/DirEC1 As2g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1766490688; x=1767095488; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:x-gm-gg:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=35J//J3STXGxn5GIiEFdh1iJxLjxlhFklWnXb6BU4V8=; b=mAiMVdf3nn1+BwiJpL/QgHkdGh948XlpFEZOVebuGzEEzGaAOxZDNL8PuQSO1sMWUO 4UFRYcgo7tW09F/hkZz9ADccf5ZTPTCOyZFtnjwhcG2bGCOt+Scbr0W+wEajDd4U+GXY 2G+cKidYXMINqQLabNmjh5bDuZfZzbtSZ73XTIImaEiZ3vBDL+jHadt62dqlLRA06agA IVhfOr8Vp5NREUr0zKFlro5TsmplUYuKgQjqKplUpF5Q3JVD9fEHsg2yb2ehNVoCGKQ1 XFD9UyK1ioFb0fRZOtjqLuAfNNYQI30B9YtL8IauPpbESNYdykSZB/zrOeKkfPIcVHvR R2lw== X-Gm-Message-State: AOJu0YzE016apigHxW79aHg1Y7uBfpWsdLPxFR3GXgytwBQMDEIfe1fw +Qisr/w0AhUMkeTHECPTrceUqXsL+w2aO+CAP5ttJ82hoH990VX7zM1v03mKwZznW83zPWeo4jE z2xLlSpNNjvlMGo5muiSVaa4v/DYvSL8= X-Gm-Gg: AY/fxX6M+NiKDpV5PsUtk/QuQjjGG40yuv9YyERwH4SjMxQyXhbq5Fu3RzNrDsnAb/l pOvdo4q7T4p+EsVhwMLIC59sbvK5QBUrc1oH1IvoMUCXQfDZwGaKduLLWa6R5mtdLv9ZCcrqLwK +Sa4As+6f+6jeu6VVxG0l/1Bp6mudPGh/veYllk2PBYzMvLFNJ5h7fbfq13Zzy/UQm1jwQ4uvsF 9IuXvO7MNYd2eNAWYHMi0RTtqsPC4e0UYW368NpsV/6CmLQO2xI2oBDBs2yUhhyafXn+Wck8ZWd wxACZw== X-Google-Smtp-Source: AGHT+IEo1MtyLAjwATGFooTkMNMLaz1yeiwjuxu5UIjo2t03ILaKa6VA1w8rimDU2RwxD+PL/boKE14gozGA+ANBSAI= X-Received: by 2002:a05:7301:df4a:b0:2ac:2c08:9017 with SMTP id 5a478bee46e88-2b05ec6fdedmr9062926eec.28.1766490688057; Tue, 23 Dec 2025 03:51:28 -0800 (PST) MIME-Version: 1.0 References: In-Reply-To: From: lakshmi Date: Tue, 23 Dec 2025 17:23:07 +0530 X-Gm-Features: AQt7F2rl4NRnWj5ItMDFRrniEdaVVslCiTrAPb2c5VqNF2qpLyJh4nVFaEAhFzU Message-ID: Subject: Re: Use log_newpage_range in HASH index build To: Kirill Reshke Cc: pgsql-hackers Content-Type: multipart/alternative; boundary="000000000000f5f67506469d29a0" List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk --000000000000f5f67506469d29a0 Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable Hi Kirill, I tested your patch on the current master and confirmed the WAL reduction during HASH index build. While testing, I noticed a possible small follow-up improvement in HASH overflow handling. Currently, any free overflow page may be reused, which can scatter overflow chains and hurt cache locality. Reusing recently freed overflow pages first could help, without changing WAL behavior or on-disk format. I would like to work on this as a follow-up enhancement and would welcome any suggestions. Best regards, Lakshmi On Tue, Dec 23, 2025 at 2:31=E2=80=AFPM Kirill Reshke wrote: > There exists an optimization to index creation process, when we omit > to write any WAL > for index build. It is currently supported in B Tree, GIN, GiST, spg > indexes. > It works because we do not need to recover anything if index creation > fails, because if was not used by any query. So, the index can be > built on-disk, and then, just before making the index alive, we can > simply log all pages to WAL. > > Hash index currently lacks this optimization. > PFA implementation. > > During my testing, I checked the amount of WAL generated by index > build before and after patch applied. My script was something like: > > select pg_current_wal_insert_lsn(); > > create index on t using hash (i); > > select pg_current_wal_insert_lsn(); > > select pg_lsn_wal_diff(lsn1, lsn2); > > Resulting numbers depend on index size, but I got 2.5-3.5 times less > WAL with this patch and 8 times less WAL with this patch + > wal_compression=3Don. > Index creation time, however, did not change much... > > About implementation: > These are many types of record that can be generated during index build. > I know for sure these are possible (double-checked using pg_waldump): > > SPLIT_COMPLETE > INSERT > SPLIT_ALLOCATE_PAGE > SPLIT_PAGE > ADD_OVFL_PAGE > SQUEEZE_PAGE > INIT_META_PAGE > INIT_BITMAP_PAGE > > > Looks like SPLIT_COMPLETE and VACUUM_ONE_PAGE are never generated > during index build. I'm not sure about MOVE_PAGE_CONTENTS. > > So, implementation is simply pass isbuild flag everywhere something is > wal-logged. Looks like it is less invasive than alternatives. > > -- > Best regards, > Kirill Reshke > --000000000000f5f67506469d29a0 Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable

Hi Kiril= l,

I tested your patch on the current m= aster and confirmed the WAL reduction during HASH index build.

While testing, I noticed a possible small follow-u= p improvement in HASH overflow handling. Currently, any free overflow page = may be reused, which can scatter overflow chains and hurt cache locality. R= eusing recently freed overflow pages first could help, without changing WAL= behavior or on-disk format.

I would li= ke to work on this as a follow-up enhancement and would welcome any suggest= ions.

Best regards,
Lakshmi


On Tue, Dec 2= 3, 2025 at 2:31=E2=80=AFPM Kirill Reshke <reshkekirill@gmail.com> wrote:
There exists an optimization to index cre= ation process, when we omit
to write any WAL
for index build. It is currently supported in B Tree, GIN, GiST, spg indexe= s.
It works because we do not need to recover anything if index creation
fails, because if was not used by any query. So, the index can be
built on-disk, and then, just before making the index alive, we can
simply log all pages to WAL.

Hash index currently lacks this optimization.
PFA implementation.

During my testing, I checked the amount of WAL generated by index
build before and after patch applied. My script was something like:

select pg_current_wal_insert_lsn();

create index on t using hash (i);

select pg_current_wal_insert_lsn();

select pg_lsn_wal_diff(lsn1, lsn2);

Resulting numbers depend on index size, but I got 2.5-3.5 times less
WAL with this patch and 8 times less WAL with this patch +
wal_compression=3Don.
Index creation time, however, did not change much...

About implementation:
These are many types of record that can be generated during index build. I know for sure these are possible (double-checked using pg_waldump):

SPLIT_COMPLETE
INSERT
SPLIT_ALLOCATE_PAGE
SPLIT_PAGE
ADD_OVFL_PAGE
SQUEEZE_PAGE
INIT_META_PAGE
INIT_BITMAP_PAGE


Looks like=C2=A0 SPLIT_COMPLETE and VACUUM_ONE_PAGE are never generated
during index build. I'm not sure about=C2=A0 MOVE_PAGE_CONTENTS.

So, implementation is simply pass isbuild flag everywhere something is
wal-logged. Looks like it is less invasive than alternatives.

--
Best regards,
Kirill Reshke
--000000000000f5f67506469d29a0--