Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1wC5kc-001ej5-1P for pgsql-hackers@arkaria.postgresql.org; Mon, 13 Apr 2026 01:06:07 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.96) (envelope-from ) id 1wC5ka-003r6E-2a for pgsql-hackers@arkaria.postgresql.org; Mon, 13 Apr 2026 01:06:05 +0000 Received: from makus.postgresql.org ([2001:4800:3e1:1::229]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1wC5ka-003r66-1H for pgsql-hackers@lists.postgresql.org; Mon, 13 Apr 2026 01:06:05 +0000 Received: from mail-ej1-x635.google.com ([2a00:1450:4864:20::635]) by makus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.98.2) (envelope-from ) id 1wC5kY-00000000iX0-2U4Z for pgsql-hackers@postgresql.org; Mon, 13 Apr 2026 01:06:04 +0000 Received: by mail-ej1-x635.google.com with SMTP id a640c23a62f3a-b941762394aso448946566b.1 for ; Sun, 12 Apr 2026 18:06:02 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1776042360; cv=none; d=google.com; s=arc-20240605; b=TCXH9+tJKf2Hfiwm7iy2Hclx9pVt91/AqOOY4rGd4gFdqFMlyESwVBq0n7Uj4MmAwo JQh4WVScgaKZanAup/KhNOrJ3TgVp/6uCzSuHk8ywgGeNub3RqhFprA+uqyv83gI59YV AaNvMn5vwwnPIn0xpnqPHanP+XfoS16xssL0IEBd5aR+abucre3wpoQG87WF43dZPcPA 8D9q9y42uWK74jqfAC/tiJ6SLBGrO2o7AYad3KsJzXlWvYVHMwhJs1gZkX+2kqocVKRn Gr4IaAI8HiOD7IaEcD+L9myD/AAeH0fp7Bh+mhYW1SDOwZr5jwT8ZB6Fm7xxGjlxbuKW CeqQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20240605; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:dkim-signature; bh=d3itKthkWSkeZgCW2wU+XqgGrTWd5ITGQrb2v+zFcGk=; fh=k5xkvcglkmDnlWONjyYMvrK6MRJZ0oh7zFE8/C/hfS4=; b=hKgG52QLHqDAvGlCqq9Cz1tKmktwjbyR4FZ0o9Xtj/CSABtneUcSRdjpAzkJjhT8jg GegmaIXXcAkJdSy1Z21W7hq+QG6zr1URAEj2O3ffgRF78kaN2UuW1KgsU4KU1/c/mIwb JDVHdp/dVmBNRdHmwwGTefUifdbDXFQ1//fCRO3D7rGoab9sKxHY7v2zsMrvl32l58aZ rT8b+f6zYtsDhcKEGCZL1pegYyqpeB3tiFkrtS/VUKZyNbx+teHjKjz1Ds1oNFhIslBW AaLMnnsH/ldn0g1SC2LwA7NZ/NJoQ6zcYENKnzjuy0bAVa72ftAClRklvZXvAUmFHHTE Ze8w==; darn=postgresql.org ARC-Authentication-Results: i=1; mx.google.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20251104; t=1776042360; x=1776647160; darn=postgresql.org; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:from:to:cc:subject:date:message-id:reply-to; bh=d3itKthkWSkeZgCW2wU+XqgGrTWd5ITGQrb2v+zFcGk=; b=HrsPMAKG0zUj25E+JsR3N2PZ2mk8niNI7A3NqaGQbFheG2ElcbOYK1aavJbO25WrfM 9qepnU7jBRoDAI8rSuSUj84Kkd4aaBZnkzm59f++1g3DuJ6kV4Tsr2bMnGdA6DmWn5L+ jisLdR32klv4XEMd/5jrX+eNa1KVvNTvRCuhEVQ1zEHz2sm47d97HzmSRHzEs5T8ekRb OrVNTYvzvI+h5ouPBmogCKx1MdWPO2P1VVxUDOhePGM2Nu7L9grA+luxwj31DOVckOBj 0S5V/4/YmgcuF2qA0U7Tq1qzmmhpvEruEEUVGhpdWQlckWKF3mYsu+/MpaKMMnoXInbG vIzA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20251104; t=1776042360; x=1776647160; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:x-gm-gg:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=d3itKthkWSkeZgCW2wU+XqgGrTWd5ITGQrb2v+zFcGk=; b=oszGh9BVjp62ZZtOxj1j+YyXlzGCs9yiAMmQ7Mmwio2z/ONV8Ke1X49wM474T3+3k0 mi/eHSFFEV8OZq9SNNzlBCvNpteIubvnoMng/m1nSTwnRciGr3Jm7kw3Ng5A5JS9koKT 58tBGsi8QwfkfwgM7xghc/fo9AQS/WtPJ47VLi3A3IHv32hMAupC7PCluj7A4xeYVHJj yPbp/5XNRxJSEBB4KcyYpHfR96dWWAYxHfhjhpL7DpZBVX6eito8fgA4DVwcRkmF65kw pIl5+gDdykaEy9qPGTsWexp7j3DxKZlp6OKcpqiWDXgEa5j6bOX8Cf4eJqEIrc+sF/Iq FIyQ== X-Forwarded-Encrypted: i=1; AFNElJ+COcdxRC3A49nBgLNQufi4UKwAwf302vfL+h5X8zVZ//MqrFEL3uKYdAYtE5DTSLUexJ/CfclWIbNAWbDX@postgresql.org X-Gm-Message-State: AOJu0YyNv3hpbxp8GbaGBlhLP2Hxdaot5+256/WtERUpNzMPDfIasmhs fV9p8zadYbPetk1i9YLPbPlv4zsc1BJIxiPQ15mHirMkwCaQGztln0Mth8TZmKsyAaO0bnUnocI sAoIUD398TFi4/8908MVJDLKmCw6QZvU= X-Gm-Gg: AeBDietb2r45/94bqIfrDE94l5gGRcHFtzZujRiberGWOnpDMom2myCPHJ/l2pP6IZh naE0AjI00Q8LmXqeN5ilIijTpm8KY4tTLcivgzv3MhNgu4Pl0OXEOyvMFQUneJeoSytZwr0ECyP sghhFn91N+txLZgAnEFAlHAG0dO1nQfcIq0xvc/Ea+ZmMWXtmu1CLmHGla19eRw7eSSSrxUDWC5 cPTcZ8pPMYhWcTpwykqVz9uhC/m/OgzwtCxmbS0XRmxus+t8v9H1n2MeiI0sxBixkoiJJb+uees yOpR+gW/HanzcN88lZsR/z+Vz1hFtoQfa02I+BQPiA== X-Received: by 2002:a17:906:730e:b0:b8f:848b:4456 with SMTP id a640c23a62f3a-b9d724da10bmr631193766b.13.1776042360018; Sun, 12 Apr 2026 18:06:00 -0700 (PDT) MIME-Version: 1.0 References: <202505181556.3n6oiowvntyr@alvherre.pgsql> <8010.1764584989@localhost> <5778.1764660480@localhost> In-Reply-To: From: Josh Kupershmidt Date: Sun, 12 Apr 2026 21:05:48 -0400 X-Gm-Features: AQROBzCp0ump1tYTwRIyVDZS6EonLZ-87islyWYVyxM_ZY8MM7evxC795VW0X1w Message-ID: Subject: Re: Revisiting {CREATE INDEX, REINDEX} CONCURRENTLY improvements To: Mihail Nikalayeu Cc: Matthias van de Meent , Antonin Houska , Hannu Krosing , Sergey Sargsyan , =?UTF-8?Q?=C3=81lvaro_Herrera?= , Andres Freund , Michael Paquier , PostgreSQL Hackers , Andrey Borodin , Melanie Plageman Content-Type: multipart/alternative; boundary="000000000000f99ab2064f4d1506" List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk --000000000000f99ab2064f4d1506 Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable On Tue, Apr 7, 2026 at 7:20=E2=80=AFPM Mihail Nikalayeu wrote: > Hello, Josh! > > Your review looks a bit LLM-generated, but anyway - thanks for review! :) > Especially because at least one point seems to be valid. > > > We're leaving behind two invalid indexes now that the user has to figur= e > > out how to drop in case of an error - that seems like it could be > confusing for the user. > > Could we have some better way (error handler, background worker) try to > perform this cleanup automatically? > > If not, we should at least tell the user clearly in the error message > that both > > invalid indexes are left behind (i.e. "idx" and "idx_ccaux" in the > example) > > Commit 0005 adds automatic dropping of auxiliary indexes when the > original index is reindexed or dropped. Also, documentation reflects > the ccaux index (similar to ccnew). > Well, we auto-drop the aux index if the user figures out that they should drop the main index first. > > Docs are inconsistent or confusing about whether there's one or two > indexes left behind in case of error > > - e.g. "command will fail but leave behind *an* invalid index and its > associated auxiliary index" > > somewhat implying there is only one invalid index, and somehow the > auxiliary index is valid? > > Auxiliary index is never marked as valid; I'm not sure we need to > highlight it here. Or do you have an idea how to rephrase? > For example in this doc change hunk: @@ -664,12 +665,19 @@ postgres=3D# \d tab col | integer | | | Indexes: "idx" btree (col) INVALID + "idx_ccaux" stir (col) INVALID - The recommended recovery - method in such cases is to drop the index and try again to perform - CREATE INDEX CONCURRENTLY. (Another possibility is - to rebuild the index with REINDEX INDEX CONCURRENTLY). + The recommended recovery method in such cases is to drop the index wit= h + DROP INDEX. The auxiliary index (suffixed with + _ccaux) will be automatically dropped when the main + index is dropped. After dropping the indexes, you can try again to perform + CREATE INDEX CONCURRENTLY. (Another possibility is to + rebuild the index with REINDEX INDEX CONCURRENTLY, + which will also handle cleanup of any invalid auxiliary indexes.) + If the only invalid index is one suffixed _ccaux, + the recommended recovery method is just DROP INDEX + for that index. The output we're showing the user from psql is two INVALID indexes, and we're keeping the original doc suggestion on the first line that "The recommended recovery method in such cases is to drop the index with DROP INDEX". The next sentence clarifies a bit that there's an auxiliary index that "will be automatically dropped". But now it's on the user to figure out which index is which, and drop the right one. > > Similarly, when the doc mentions e.g. "drop the index" - it's not > necessarily clear which index > > we're talking about when there are two invalid indexes left behind that > the user will see with `\d` > > In one commit it says: "method in such cases is to drop these indexes > and try again to perform". > After 0005 "The auxiliary index (suffixed with > _ccaux) will be automatically dropped when the main > index is dropped". > It seems clear to me, but feel free to provide your variant. > > > * It would be nice to guard against users trying arbitrary CREATE INDE= X > ... USING stir(...) with a clear error > > It will fail with "Building STIR indexes is not supported". > Sorry, you are right, this is handled with a good error. > > > One of the testcases (line 2478 of patch 0004) does `DELETE FROM > concur_reindex_tab4 WHERE c1 =3D 1;` > > but the table `concur_reindex_tab4` looks like it has been dropped a fe= w > lines above that? > > Hm, yep, I'll fix it. > > > The StirPageGetFreeSpace macro from patch 0002 reads > `StirPageGetMaxOffset(page)` > > which seems like it could cause an unsafe read of opaque->maxoff if use= d > on the metapage > > But it was never used for the metapage. > Yes, but I think it'd be better not to leave a possible foot-gun around for the next developer. Even just adding an AssertMacro like: #define StirPageGetMaxOffset(page) (AssertMacro(!StirPageIsMeta(page)), StirPageGetOpaque(page)->maxoff) There are some other asserts for some of the trickier bookkeeping that happens in this patch that I think would help check the code, and make it easier to understand as well. E.g. adding an assertion check at the end of StirPageAddItem(), and inside stirbulkdelete() (I tried calling it around L499 of stir.c, just before 'while (itup < itupEnd)'). /* * Validate that maxoff and pd_lower are consistent on a STIR data page. * * On a freshly initialized empty page, pd_lower is SizeOfPageHeaderData * (set by PageInit). After the first insert, pd_lower is computed from * PageGetContents which uses MAXALIGN(SizeOfPageHeaderData). */ static inline void StirPageValidate(Page page) { Assert(!StirPageIsMeta(page)); Assert(StirPageGetOpaque(page)->maxoff =3D=3D 0 ? ((PageHeader) page)->pd_lower =3D=3D SizeOfPageHeaderData : ((PageHeader) page)->pd_lower =3D=3D MAXALIGN(SizeOfPageHeaderData) + StirPageGetOpaque(page)->maxoff * sizeof(StirTuple)); } > > > A comment explains "No predicate evaluation is needed here" , i.e. we > are skipping predicate > > evaluation in the validation scan step, assuming that the > > auxiliary index contains only qualifying TIDs. Is this really > bulletproof for e.g. partial indexes which may > > no longer satisfy the predicate at the time of the validation scan due > to conflicting HOT updates? > > Conflicting HOT updates are not possible because the catalog contains > the new index definition from the start of the process. > Or do you mean a different scenario? > Sorry for the false alarm, I believe you are right - I had to double check RelationGetIndexAttrBitmap(), but I believe this is safe based on the hotblockingattrs bitmap. Overall, this is a nice improvement for CIC/RC that I think should help particularly on large, busy systems. Thanks, Josh --000000000000f99ab2064f4d1506 Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable


On = Tue, Apr 7, 2026 at 7:20=E2=80=AFPM Mihail Nikalayeu <mihailnikalayeu@gmail.com&= gt; wrote:
Hello= , Josh!

Your review looks a bit LLM-generated, but anyway - thanks for review! :) Especially because at least one point seems to be valid.

> We're leaving behind two invalid indexes now that the user has to = figure
> out how to drop in case of an error - that seems like it could be conf= using for the user.
> Could we have some better way (error handler, background worker) try t= o perform this cleanup automatically?
> If not, we should at least tell the user clearly in the error message = that both
> invalid indexes are left behind (i.e. "idx" and "idx_cc= aux" in the example)

Commit 0005 adds automatic dropping of auxiliary indexes when the
original index is reindexed or dropped. Also, documentation reflects
the ccaux index (similar to ccnew).

Wel= l, we auto-drop the aux index if the user figures out that they should drop= the main index first.
=C2=A0
> Docs are inconsistent or confusing about whether there's one or tw= o indexes left behind in case of error
> - e.g. "command will fail but leave behind *an* invalid index and= its associated auxiliary index"
> somewhat implying there is only one invalid index, and somehow the aux= iliary index is valid?

Auxiliary index is never marked as valid; I'm not sure we need to
highlight it here. Or do you have an idea how to rephrase?
=

For example in this doc change hunk:

@@ -664,12 +665,19 @@ postgres=3D# \d tab
=C2=A0 col =C2=A0 =C2= =A0| integer | =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 | =C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0|
=C2=A0Indexes:
=C2=A0 =C2=A0 =C2=A0"idx" btr= ee (col) INVALID
+ =C2=A0 =C2=A0"idx_ccaux" stir (col) INVALID=
=C2=A0</programlisting>
=C2=A0
- =C2=A0 =C2=A0The recommend= ed recovery
- =C2=A0 =C2=A0method in such cases is to drop the index and= try again to perform
- =C2=A0 =C2=A0<command>CREATE INDEX CONCURR= ENTLY</command>. =C2=A0(Another possibility is
- =C2=A0 =C2=A0to r= ebuild the index with <command>REINDEX INDEX CONCURRENTLY</command= >).
+ =C2=A0 =C2=A0The recommended recovery method in such cases is t= o drop the index with
+ =C2=A0 =C2=A0<command>DROP INDEX</comma= nd>. The auxiliary index (suffixed with
+ =C2=A0 =C2=A0<literal>= ;_ccaux</literal>) will be automatically dropped when the main
+ = =C2=A0 =C2=A0index is dropped. After dropping the indexes, you can try agai= n to perform
+ =C2=A0 =C2=A0<command>CREATE INDEX CONCURRENTLY<= /command>. (Another possibility is to
+ =C2=A0 =C2=A0rebuild the inde= x with <command>REINDEX INDEX CONCURRENTLY</command>,
+ =C2= =A0 =C2=A0which will also handle cleanup of any invalid auxiliary indexes.)=
+ =C2=A0 =C2=A0If the only invalid index is one suffixed <literal>= ;_ccaux</literal>,
+ =C2=A0 =C2=A0the recommended recovery method = is just <literal>DROP INDEX</literal>
+ =C2=A0 =C2=A0for tha= t index.
=C2=A0 =C2=A0 </para>

The ou= tput we're showing the user from psql is two INVALID indexes, and we= 9;re keeping the original doc suggestion on the first line that "The r= ecommended recovery method in such cases is to drop the index with DROP IND= EX". The next sentence clarifies a bit that there's an auxiliary i= ndex that "will be automatically dropped". But now it's on th= e user to figure out which index is which, and drop the right one.


> Similarly, when the doc mentions e.g. "drop the index" - it&= #39;s not necessarily clear which index
> we're talking about when there are two invalid indexes left behind= that the user will see with `\d`

In one commit it says: "method in such cases is to drop these indexes<= br> and try again to perform".
After 0005 "The auxiliary index (suffixed with
<literal>_ccaux</literal>) will be automatically dropped when t= he main
index is dropped".
It seems clear to me, but feel free to provide your variant.

>=C2=A0 * It would be nice to guard against users trying arbitrary CREAT= E INDEX ... USING stir(...) with a clear error

It will fail with "Building STIR indexes is not supported".

Sorry, you are right, this is handled with a= good error.
=C2=A0

> One of the testcases (line 2478 of patch 0004) does `DELETE FROM concu= r_reindex_tab4 WHERE c1 =3D 1;`
> but the table `concur_reindex_tab4` looks like it has been dropped a f= ew lines above that?

Hm, yep, I'll fix it.

> The StirPageGetFreeSpace macro from patch 0002 reads `StirPageGetMaxOf= fset(page)`
> which seems like it could cause an unsafe read of opaque->maxoff if= used on the metapage

But it was never used for the metapage.

Yes, but I think it'd be better not to leave a possible foot-gun aroun= d for the next developer. Even just adding an AssertMacro like:
<= br>
#define StirPageGetMaxOffset(page) (AssertMacro(!StirPageIsMe= ta(page)), StirPageGetOpaque(page)->maxoff)

There are some other asserts for some of the trickier bookkeeping that hap= pens in this patch that I think would help check the=C2=A0code, and make it= easier to understand as well. E.g. adding an assertion check at the end of= StirPageAddItem(), and inside stirbulkdelete() (I tried calling it around = L499 of stir.c, just before 'while (itup < itupEnd)').=C2=A0

/*
=C2=A0* Validate that maxoff and pd_lower are c= onsistent on a STIR data page.
=C2=A0*
=C2=A0* On a freshly initializ= ed empty page, pd_lower is SizeOfPageHeaderData
=C2=A0* (set by PageInit= ).=C2=A0 After the first insert, pd_lower is computed from
=C2=A0* PageG= etContents which uses MAXALIGN(SizeOfPageHeaderData).
=C2=A0*/
=
static inline void
StirPageValidate(Page page)
{
Assert(!Sti= rPageIsMeta(page));
Assert(StirPageGetOpaque(page)->maxoff =3D=3D 0<= br> =C2=A0 ? ((PageHeader) page)->pd_lower =3D=3D SizeOfPageHeaderData=
=C2=A0 : ((PageHeader) page)->pd_lower =3D=3D
=C2=A0 =C2=A0= MAXALIGN(SizeOfPageHeaderData) +
=C2=A0 =C2=A0 StirPageGetOpaque(pag= e)->maxoff * sizeof(StirTuple));
}
=C2=A0

> A comment explains "No predicate evaluation is needed here" = , i.e. we are skipping predicate
> evaluation in the validation scan step, assuming that the
> auxiliary index contains only qualifying TIDs. Is this really bulletpr= oof for e.g. partial indexes which may
> no longer satisfy the predicate at the time of the validation scan due= to conflicting HOT updates?

Conflicting HOT updates are not possible because the catalog contains
the new index definition from the start of the process.
Or do you mean a different scenario?

So= rry for the false alarm, I believe you are right - I had to double check=C2= =A0RelationGetIndexAttrBitmap(), but I believe this is safe based on the=C2= =A0hotblockingattrs=C2=A0bitmap.=C2=A0

Overall, th= is is a nice improvement for CIC/RC that I think should help particularly o= n large, busy systems.

Thanks,
Josh
=C2=A0
--000000000000f99ab2064f4d1506--