Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.92) (envelope-from ) id 1nrj3p-0004qd-13 for pgsql-docs@arkaria.postgresql.org; Thu, 19 May 2022 16:31:37 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.92) (envelope-from ) id 1nrj3n-0000cj-58 for pgsql-docs@arkaria.postgresql.org; Thu, 19 May 2022 16:31:35 +0000 Received: from makus.postgresql.org ([2001:4800:3e1:1::229]) by malur.postgresql.org with esmtps (TLS1.3:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.92) (envelope-from ) id 1nrj3m-0000ca-E1 for pgsql-docs@lists.postgresql.org; Thu, 19 May 2022 16:31:34 +0000 Received: from mail-pl1-x630.google.com ([2607:f8b0:4864:20::630]) by makus.postgresql.org with esmtps (TLS1.3:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.92) (envelope-from ) id 1nrj3j-0001ZR-9O for pgsql-docs@lists.postgresql.org; Thu, 19 May 2022 16:31:33 +0000 Received: by mail-pl1-x630.google.com with SMTP id q18so5254941pln.12 for ; Thu, 19 May 2022 09:31:31 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=6Wbi8dSYXe9VQkgmWgP8DPOA7zFdAw88S2paqQ11wSE=; b=Ls0VB1Petv8tHB1yvxiu+UJ4NNLK3sJQ4D1Zl8kmLPChukMUKAgXuySSaV07wskL0K Yj6jGFY/CwsPLiel0vtAwDmU6J3+dbUQRRB3iyDzvll/cas2L2LkDDvg0+H5v5e+1IYt t6HW7wL+q+q6Pcm38VvmorU4TrvpqEbSB4BElFL2i3lEp8ZfG/sDni8onqNNP62Ofady nfRNhZ3lE+24K9+d2cCL2hYQDJJkk3dWswLYuFiRGgjpvxvyPasxiwQAUSqPVxY1scD1 v/SSvsJ61rVkvnpQTmooQAAhA9eEC/muBWKqu8HrWr04+6ViLTIp6ja9NJu5G1xKQc/i acFA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=6Wbi8dSYXe9VQkgmWgP8DPOA7zFdAw88S2paqQ11wSE=; b=I4R2PS512ERe09MHD8WGfiyRPnqzcJObmqtQs4WxJTw9UWnZnGgHrWbFLCh/HGG2fz Id3Ny9StdVFqANyKf3MwR4H75n+NK1hxRf1Jp5YDA8qeGg0ugUSXbwAoDx79w+Ht4DLQ Z1uNCSEFKtEOFlAD8sPcOR0Ehluum7wkiYwzrie7aFtUiK4pefvBRXJh/dTRe0xxxebf pS6nx7+Acie6//TcDWxHEZ/Ehc6SndczppIn4rzAg5j3tNtv34nHw4TB8H0cVQC2Lg1c To+0qU06MRL67/9JEfOklyPpzl6pcFT7rs8daUeQYTyArnNJJPsxLDFBy2qwqbGWjllZ o9xw== X-Gm-Message-State: AOAM531KJQlODEsp176c48HYHHz+/q+9yYltq5/tfjpX8mDZVzC8ehjJ aJz/PhS1xnlsLFVLBhrR7wdaLjGuROidxKPKyL4= X-Google-Smtp-Source: ABdhPJx4c4k3OZjGcZ5N05x6N+Atfs5Ak3Hp/NCGpjQZAqLuiiK24gQbUgveWEfypSyt4BgPxq1hugbb5r9AB6kzOrc= X-Received: by 2002:a17:90a:8b8f:b0:1df:14f4:b3d3 with SMTP id z15-20020a17090a8b8f00b001df14f4b3d3mr6581104pjn.163.1652977889998; Thu, 19 May 2022 09:31:29 -0700 (PDT) MIME-Version: 1.0 References: <01987150-EDDC-4567-944C-370F46AE9807@gmail.com> In-Reply-To: <01987150-EDDC-4567-944C-370F46AE9807@gmail.com> From: Rajakavitha Kodhandapani Date: Thu, 19 May 2022 22:01:18 +0530 Message-ID: Subject: Re: INVALID index while concurrent indexing in progress? To: Lauren Fliksteen Cc: Laurenz Albe , pgsql-docs@lists.postgresql.org Content-Type: multipart/alternative; boundary="000000000000f4798d05df5fe7bc" List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk --000000000000f4798d05df5fe7bc Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable Thank you, Laurenz. The changes that you suggested make a lot more sense. I will make the updates and submit the changes. Regards, Rajie On Thu, May 19, 2022 at 9:45 PM Lauren Fliksteen wrote: > Thank you both! I think Laurenz=E2=80=99s changes make perfect sense! > > Sent from my iPhone > > > On May 19, 2022, at 8:37 AM, Laurenz Albe > wrote: > > > > =EF=BB=BFOn Thu, 2022-05-19 at 19:02 +0530, Rajakavitha Kodhandapani wr= ote: > >>> I think the INVALID index can use further explanation, in particular, > from > >>> my experience it seems like when building an index concurrently, the > index > >>> gets inserted and labeled invalid while the index is being built, and > then > >>> the label gets removed if it finishes successfully or gets left on th= e > index > >>> if there is a failure while building the index. It is my current > >>> understanding, after experimenting, that INVALID means 'incomplete', > whether > >>> that's because it's in progress or because it was unable to be > completed, > >>> but prior to my experiment my understanding was that INVALID indicate= d > >>> failure. > >>> > >>> This was especially confusing when we were adding an index to a very > large > >>> table because we assumed the INVALID index indicated failure when we > >>> couldn't find any other sign of progress or failure. > >> > >> This is my first attempt at contributing to the documentation of > PostgreSQL. > >> Here's the patch. Please let me know if any other changes need to be > made. > > > > Thank you! Please send patches as plain text and use bottom posting. > > > >> \cf3 @@ -665,11 +665,14 @@\cf5 Indexes:\ > >> \ > >> Another caveat when building a unique index concurrently is that > the\ > >> uniqueness constraint is already being enforced against other > transactions\ > >> \cf6 - when the second table scan begins. This means that > constraint violations\cf5 \ > >> \cf4 + when the second table scan begins. This means that constrain= t > violations\cf5 \ > >> could be reported in other queries prior to the index becoming > available\ > >> \cf6 - for use, or even in cases where the index build eventually > fails. Also,\cf5 \ > >> \cf6 - if a failure does occur in the second scan, the > invalid index\cf5 \ > >> \cf6 - continues to enforce its uniqueness constraint > afterwards.\cf5 \ > >> \cf4 + for use, or even in cases where the index build eventually > fails. The index\cf5 \ > >> \cf4 + is inserted and labeled invalid while the > index is being built,\cf5 \ > >> \cf4 + and then the label is removed if the index builds > successfully. If the index does\cf5 \ > >> \cf4 + not build successfully, then the label invalid > remains. Also, if a\cf5 \ > >> } > > > > I don't think that this information should be added to a paragraph that > > focuses on uniqueness checks in concurrent index builds. > > > > Actually, most of the information is already there. To quote from the > page: > > > > If a problem arises while scanning the table, such as a deadlock or a > uniqueness violation > > in a unique index, the CREATE INDEX command will fail but leave behind > an =E2=80=9Cinvalid=E2=80=9D index. > > > > How about the following patch to emphasize the role of "invalid": > > > > diff --git a/doc/src/sgml/ref/create_index.sgml > b/doc/src/sgml/ref/create_index.sgml > > index d3102a87d9..fee2c61e5e 100644 > > --- a/doc/src/sgml/ref/create_index.sgml > > +++ b/doc/src/sgml/ref/create_index.sgml > > @@ -622,7 +622,7 @@ CREATE [ UNIQUE ] INDEX [ CONCURRENTLY ] [ [ IF NOT > EXISTS ] > > > > > > > - In a concurrent index build, the index is actually entered into > > + In a concurrent index build, the index is actually entered as > invalid index into > > the system catalogs in one transaction, then two table scans occur = in > > two more transactions. Before each table scan, the index build mus= t > > wait for existing transactions that have modified the table to > terminate. > > @@ -631,7 +631,7 @@ CREATE [ UNIQUE ] INDEX [ CONCURRENTLY ] [ [ IF NOT > EXISTS ] > scan to terminate, including transactions used by any phase of > concurrent > > index builds on other tables, if the indexes involved are partial o= r > have > > columns that are not simple column references. > > - Then finally the index can be marked ready for use, > > + Then finally the index can be marked valid and read= y > for use, > > and the CREATE INDEX command terminates. > > Even then, however, the index may not be immediately usable for > queries: > > in the worst case, it cannot be used as long as transactions exist > that > > > > Yours, > > Laurenz Albe > --000000000000f4798d05df5fe7bc Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable
Thank you, Laurenz. The changes that you suggested make a = lot more sense.
I will make the updates and submit the changes.

Regards,
Rajie

On Thu, = May 19, 2022 at 9:45 PM Lauren Fliksteen <dancernerd32@gmail.com> wrote:
Thank you both! I think Laurenz=E2=80=99s= changes make perfect sense!

Sent from my iPhone

> On May 19, 2022, at 8:37 AM, Laurenz Albe <laurenz.albe@cybertec.at> wrot= e:
>
> =EF=BB=BFOn Thu, 2022-05-19 at 19:02 +0530, Rajakavitha Kodhandapani w= rote:
>>> I think the INVALID index can use further explanation, in part= icular, from
>>> my experience it seems like when building an index concurrentl= y, the index
>>> gets inserted and labeled invalid while the index is being bui= lt, and then
>>> the label gets removed if it finishes successfully or gets lef= t on the index
>>> if there is a failure while building the index.=C2=A0 It is my= current
>>> understanding, after experimenting, that INVALID means 'in= complete', whether
>>> that's because it's in progress or because it was unab= le to be completed,
>>> but prior to my experiment my understanding was that INVALID i= ndicated
>>> failure.
>>>
>>> This was especially confusing when we were adding an index to = a very large
>>> table because we assumed the INVALID index indicated failure w= hen we
>>> couldn't find any other sign of progress or failure.
>>
>> This is my first attempt at contributing to the documentation of P= ostgreSQL.
>> Here's the patch. Please let me know if any other changes need= to be made.
>
> Thank you!=C2=A0 Please send patches as plain text and use bottom post= ing.
>
>> \cf3 @@ -665,11 +665,14 @@\cf5=C2=A0 Indexes:\
>>=C2=A0 =C2=A0 <para>\
>>=C2=A0 =C2=A0 =C2=A0Another caveat when building a unique index con= currently is that the\
>>=C2=A0 =C2=A0 =C2=A0uniqueness constraint is already being enforced= against other transactions\
>> \cf6 -=C2=A0 =C2=A0 when the second table scan begins.=C2=A0 This = means that constraint violations\cf5 \
>> \cf4 +=C2=A0 =C2=A0 when the second table scan begins. This means = that constraint violations\cf5 \
>>=C2=A0 =C2=A0 =C2=A0could be reported in other queries prior to the= index becoming available\
>> \cf6 -=C2=A0 =C2=A0 for use, or even in cases where the index buil= d eventually fails.=C2=A0 Also,\cf5 \
>> \cf6 -=C2=A0 =C2=A0 if a failure does occur in the second scan, th= e <quote>invalid</quote> index\cf5 \
>> \cf6 -=C2=A0 =C2=A0 continues to enforce its uniqueness constraint= afterwards.\cf5 \
>> \cf4 +=C2=A0 =C2=A0 for use, or even in cases where the index buil= d eventually fails. The index\cf5 \
>> \cf4 +=C2=A0 =C2=A0 is inserted and labeled <quote>invalid&l= t;/quote> while the index is being built,\cf5 \
>> \cf4 +=C2=A0 =C2=A0 and then the label is removed if the index bui= lds successfully. If the index does\cf5 \
>> \cf4 +=C2=A0 =C2=A0 not build successfully, then the label <quo= te>invalid</quote> remains. Also, if a\cf5 \
>> }
>
> I don't think that this information should be added to a paragraph= that
> focuses on uniqueness checks in concurrent index builds.
>
> Actually, most of the information is already there.=C2=A0 To quote fro= m the page:
>
>=C2=A0 If a problem arises while scanning the table, such as a deadlock= or a uniqueness violation
>=C2=A0 in a unique index, the CREATE INDEX command will fail but leave = behind an =E2=80=9Cinvalid=E2=80=9D index.
>
> How about the following patch to emphasize the role of "invalid&q= uot;:
>
> diff --git a/doc/src/sgml/ref/create_index.sgml b/doc/src/sgml/ref/cre= ate_index.sgml
> index d3102a87d9..fee2c61e5e 100644
> --- a/doc/src/sgml/ref/create_index.sgml
> +++ b/doc/src/sgml/ref/create_index.sgml
> @@ -622,7 +622,7 @@ CREATE [ UNIQUE ] INDEX [ CONCURRENTLY ] [ [ IF NO= T EXISTS ] <replaceable class=3D
>=C2=A0 =C2=A0 </para>
>
>=C2=A0 =C2=A0 <para>
> -=C2=A0 =C2=A0 In a concurrent index build, the index is actually ente= red into
> +=C2=A0 =C2=A0 In a concurrent index build, the index is actually ente= red as <quote>invalid</quote> index into
>=C2=A0 =C2=A0 =C2=A0the system catalogs in one transaction, then two ta= ble scans occur in
>=C2=A0 =C2=A0 =C2=A0two more transactions.=C2=A0 Before each table scan= , the index build must
>=C2=A0 =C2=A0 =C2=A0wait for existing transactions that have modified t= he table to terminate.
> @@ -631,7 +631,7 @@ CREATE [ UNIQUE ] INDEX [ CONCURRENTLY ] [ [ IF NO= T EXISTS ] <replaceable class=3D
>=C2=A0 =C2=A0 =C2=A0scan to terminate, including transactions used by a= ny phase of concurrent
>=C2=A0 =C2=A0 =C2=A0index builds on other tables, if the indexes involv= ed are partial or have
>=C2=A0 =C2=A0 =C2=A0columns that are not simple column references.
> -=C2=A0 =C2=A0 Then finally the index can be marked ready for use,
> +=C2=A0 =C2=A0 Then finally the index can be marked <quote>valid= </quote> and ready for use,
>=C2=A0 =C2=A0 =C2=A0and the <command>CREATE INDEX</command>= command terminates.
>=C2=A0 =C2=A0 =C2=A0Even then, however, the index may not be immediatel= y usable for queries:
>=C2=A0 =C2=A0 =C2=A0in the worst case, it cannot be used as long as tra= nsactions exist that
>
> Yours,
> Laurenz Albe
--000000000000f4798d05df5fe7bc--