Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.92) (envelope-from ) id 1nriDs-0002fj-MV for pgsql-docs@arkaria.postgresql.org; Thu, 19 May 2022 15:37:56 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.92) (envelope-from ) id 1nriDr-0005I6-Iy for pgsql-docs@arkaria.postgresql.org; Thu, 19 May 2022 15:37:55 +0000 Received: from makus.postgresql.org ([2001:4800:3e1:1::229]) by malur.postgresql.org with esmtps (TLS1.3:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.92) (envelope-from ) id 1nriDr-0005Hw-6L for pgsql-docs@lists.postgresql.org; Thu, 19 May 2022 15:37:55 +0000 Received: from mail-ej1-x636.google.com ([2a00:1450:4864:20::636]) by makus.postgresql.org with esmtps (TLS1.3:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.92) (envelope-from ) id 1nriDo-0001A0-DF for pgsql-docs@lists.postgresql.org; Thu, 19 May 2022 15:37:54 +0000 Received: by mail-ej1-x636.google.com with SMTP id ch13so10657021ejb.12 for ; Thu, 19 May 2022 08:37:51 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=cybertec-at.20210112.gappssmtp.com; s=20210112; h=message-id:subject:from:to:date:in-reply-to:references:user-agent :mime-version:content-transfer-encoding; bh=KUs6KwdBF7dUuZsR2MdZ8x8QV2pZydgdOxJQXfK+wyg=; b=sWTZHwM9SepYnhP/w+YDrCsA3GlN2dDRe/R79s+yx8xjNh8HIY4eSkA3jF2uuipHpv DunahJoQiupeHRkX5c6Kv/wee+r6VTmseAqgHoAn7MS1WTCWC5LH+DScOd+7K/+NklOD s/zL/B5BKj+BxZlxdfFjXE4WAIthP8xc2UccLSS/rGNHEYFJfScjEXAMvjC8N8+m2LEy pDGLYoXdZed4EBlBY1AhTrrrsF0BxdPUwMgohvP8V0ZhthboaMVpjzOHfn0CVRhqk/57 pwBCHnQH+Bpe2PXhgzq9VGjywTN/zkWxIqI3+7JiO6TZKcBhEgR+6R5j+lje3O1w5D/O Kz9w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:message-id:subject:from:to:date:in-reply-to :references:user-agent:mime-version:content-transfer-encoding; bh=KUs6KwdBF7dUuZsR2MdZ8x8QV2pZydgdOxJQXfK+wyg=; b=khPo8MVedjmWvZWLxZzTVbvIkrgXhpVIURnvJnpUk+NaJWRgudrN1hgpNaXySu1Gvz yK4IDWYfy7GhmKR9BmsEoI6We60b6AX9xQvOZ6UU4SOC+piY7dwM8ZFhV1PM8WHORx8s F0WGgc1qh6qjyC/invMnGzSVaC7lx7O1+2UdXZFtExkNW6g/cMxznboPncOrLjjfZKkH 6Qz8uOKCL/vjehWBDgw2EM8iwNyiumwAZAdJy23/FZliwY9ho1aohJZmM0ti+pHl2Vrf +Jh0lswo3K9exWk0yYQ9vkJzJIKL/U0ZZ0RKYG0TxgTUZkdweC8Q8L9NNScQKEs8E9k6 /dgA== X-Gm-Message-State: AOAM5324yN4TZ8CX6YyDTTtuvUAHFLf/iWezrJu9ZBT03e1c7bwLT/dn oW0iaQ+VqL+Znj3ukOVzEnrgkA== X-Google-Smtp-Source: ABdhPJzvierXfa/K1s25oZCsvIxRxqlik6tOZCEhVkhH3M41pbb+qpQR0BWuICQK8MBZiXOlQBGHOw== X-Received: by 2002:a17:907:728b:b0:6f9:a9bf:a8e7 with SMTP id dt11-20020a170907728b00b006f9a9bfa8e7mr4959415ejc.696.1652974670395; Thu, 19 May 2022 08:37:50 -0700 (PDT) Received: from dynamic-pd01.res.v6.highway.a1.net ([2001:871:5e:1f41:c34f:9eb1:7394:db5a]) by smtp.gmail.com with ESMTPSA id zp26-20020a17090684fa00b006f3ef214e37sm2210226ejb.157.2022.05.19.08.37.49 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 19 May 2022 08:37:49 -0700 (PDT) Message-ID: Subject: Re: INVALID index while concurrent indexing in progress? From: Laurenz Albe To: Rajakavitha Kodhandapani , dancernerd32@gmail.com, pgsql-docs@lists.postgresql.org Date: Thu, 19 May 2022 17:37:48 +0200 In-Reply-To: References: <165290238488.670.7500177735573254738@wrigleys.postgresql.org> Content-Type: text/plain; charset="UTF-8" User-Agent: Evolution 3.42.4 (3.42.4-2.fc35) MIME-Version: 1.0 Content-Transfer-Encoding: 8bit List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk On Thu, 2022-05-19 at 19:02 +0530, Rajakavitha Kodhandapani wrote: > > I think the INVALID index can use further explanation, in particular, from > > my experience it seems like when building an index concurrently, the index > > gets inserted and labeled invalid while the index is being built, and then > > the label gets removed if it finishes successfully or gets left on the index > > if there is a failure while building the index.  It is my current > > understanding, after experimenting, that INVALID means 'incomplete', whether > > that's because it's in progress or because it was unable to be completed, > > but prior to my experiment my understanding was that INVALID indicated > > failure. > > > > This was especially confusing when we were adding an index to a very large > > table because we assumed the INVALID index indicated failure when we > > couldn't find any other sign of progress or failure. > > This is my first attempt at contributing to the documentation of PostgreSQL. > Here's the patch. Please let me know if any other changes need to be made. Thank you! Please send patches as plain text and use bottom posting. > \cf3 @@ -665,11 +665,14 @@\cf5 Indexes:\ > \ > Another caveat when building a unique index concurrently is that the\ > uniqueness constraint is already being enforced against other transactions\ > \cf6 - when the second table scan begins. This means that constraint violations\cf5 \ > \cf4 + when the second table scan begins. This means that constraint violations\cf5 \ > could be reported in other queries prior to the index becoming available\ > \cf6 - for use, or even in cases where the index build eventually fails. Also,\cf5 \ > \cf6 - if a failure does occur in the second scan, the invalid index\cf5 \ > \cf6 - continues to enforce its uniqueness constraint afterwards.\cf5 \ > \cf4 + for use, or even in cases where the index build eventually fails. The index\cf5 \ > \cf4 + is inserted and labeled invalid while the index is being built,\cf5 \ > \cf4 + and then the label is removed if the index builds successfully. If the index does\cf5 \ > \cf4 + not build successfully, then the label invalid remains. Also, if a\cf5 \ > } I don't think that this information should be added to a paragraph that focuses on uniqueness checks in concurrent index builds. Actually, most of the information is already there. To quote from the page: If a problem arises while scanning the table, such as a deadlock or a uniqueness violation in a unique index, the CREATE INDEX command will fail but leave behind an “invalid” index. How about the following patch to emphasize the role of "invalid": diff --git a/doc/src/sgml/ref/create_index.sgml b/doc/src/sgml/ref/create_index.sgml index d3102a87d9..fee2c61e5e 100644 --- a/doc/src/sgml/ref/create_index.sgml +++ b/doc/src/sgml/ref/create_index.sgml @@ -622,7 +622,7 @@ CREATE [ UNIQUE ] INDEX [ CONCURRENTLY ] [ [ IF NOT EXISTS ] - In a concurrent index build, the index is actually entered into + In a concurrent index build, the index is actually entered as invalid index into the system catalogs in one transaction, then two table scans occur in two more transactions. Before each table scan, the index build must wait for existing transactions that have modified the table to terminate. @@ -631,7 +631,7 @@ CREATE [ UNIQUE ] INDEX [ CONCURRENTLY ] [ [ IF NOT EXISTS ] valid and ready for use, and the CREATE INDEX command terminates. Even then, however, the index may not be immediately usable for queries: in the worst case, it cannot be used as long as transactions exist that Yours, Laurenz Albe