Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1vP3pW-002L2n-1s for pgsql-hackers@arkaria.postgresql.org; Fri, 28 Nov 2025 19:08:30 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.96) (envelope-from ) id 1vP3pT-00DCeY-1u for pgsql-hackers@arkaria.postgresql.org; Fri, 28 Nov 2025 19:08:27 +0000 Received: from makus.postgresql.org ([2001:4800:3e1:1::229]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1vP3pT-00DCeN-0N for pgsql-hackers@lists.postgresql.org; Fri, 28 Nov 2025 19:08:27 +0000 Received: from mail-qt1-x82e.google.com ([2607:f8b0:4864:20::82e]) by makus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.96) (envelope-from ) id 1vP3pR-001zbj-05 for pgsql-hackers@postgresql.org; Fri, 28 Nov 2025 19:08:26 +0000 Received: by mail-qt1-x82e.google.com with SMTP id d75a77b69052e-4edb8d6e98aso813161cf.0 for ; Fri, 28 Nov 2025 11:08:25 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1764356905; x=1764961705; darn=postgresql.org; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=HbhubxyLBGzf3Ng7wlsor8h/MaHjGHily3Vu41M3nJs=; b=1vDmDhstwXJRF/WxWzui/+PBhYYVQaNqrOTGqp1sKz8VnDyVlPLqD07aKa2K93u7eM bPB0CbqYQ1k0DECwQURhnfrg50ZAia8Aw+J0ObI8hv4BHgLe/JgeGHcB3c171AQq1G66 BPxZc2jcZcKmy81miPS5OFOxyhkyJgLN5Nvr2TDBOWzP33ZaGLUbO6e2y3MMeugbYHqM dsDVTuaKCHgbAj6+UKSGjQ8dnLT1graFkaVOSNTtWdK1CWU3r9BvfVqvkGx/9lzOzjAB 1BszH9SikAlUBeLaIMQ4N0WYhavQmewSL/DdCEu1jJK/jL159sHboHmKBKd8v71Cb7Hq 0/eA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1764356905; x=1764961705; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-gg:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=HbhubxyLBGzf3Ng7wlsor8h/MaHjGHily3Vu41M3nJs=; b=UKBS2JtwKZimUubn9N56Xad5yDLiR41fnxlZpELJ83M58gn8bg6hEIdPPvx4WxhpGI kPQrZM5EwtkTR9jMzgrBB0wApRXNr0Bo3qImqkF+gfMjEg+YBu4y3+Q9Q97IF/21ScY7 z0gk2QBBOZ690SYbpllQjRYLFc6wbaPs5HHMx5s731V7wBWmC5z98aCo3OkL8liEs207 8ZleRXuYqfsrN/NfR4xCMdWlojw+lYk/sKfECTcSQ2gXnmIMWhXKZ71QYO0plD2nEWtP TG0ZIfxFkYXIjKdFqUIAASRM6Q960jMECbL86+fa2hqqQmEy0SsTZT8Q3j0aqMlO0wQV riyw== X-Forwarded-Encrypted: i=1; AJvYcCWJoL5y8yIxwBka7iGtw6riyaTQksvmCClJCUWz7O8fsqn9BiNl9tTKf1GsJ8d/r0MAjoIRUTm2sccMDt5k@postgresql.org X-Gm-Message-State: AOJu0YxN7xUyFF8IBcKXDWSg2TEPmvwl+JiEcQ6xFU2A5iVteaCiKsnt IglTr5gn/ovtvYFiGUVrnbqr859LhpxBTRvjGfl3ohJfN5KcImgtWL8f4UDzAp6ZGYFV0wOfXkd RRfi9feSBew717AgslF4RQy8mGpnYIl28a2MrLmCl X-Gm-Gg: ASbGnctBjrZB1eyRHV2toaW85MSikq7EEZmKyX2zCVqyyvWwjChtNJXEDBXMJ0aEsFf FKbkh38EMBuJ8TKiVs1+IxYoX1Md1EjDdEWQSTxY0f9anKzMI1qZ7b6tIhHAz4PIi9ZXhKWDdGn n4o9Q5nK4wsW8SLkbJlRliBQrllVnHbmUiHkFOpuKTSWRz3iZN4H4kENjPe+2Qikd204w9b7kTt JGNx89zKobGOjEf/dK5qAZvh0feOKiyu1l6flnMQwd1U2O+PglmxE4H2b/8r3jD0bUJEIyn6dWl w08g8kaiSDNGD9s/73Zyt9yMiGrm X-Google-Smtp-Source: AGHT+IGvcLfrBxRYE/Se2t8rFtm4BHwW3KUbDE+9MfWThiLh6poJDNtsfP9Xvvx2Ti/x+irxDso/dPkF3mkmdLRgEUU= X-Received: by 2002:a05:622a:14cd:b0:4ed:ff77:1a86 with SMTP id d75a77b69052e-4efd0cc52c1mr8806051cf.18.1764356904662; Fri, 28 Nov 2025 11:08:24 -0800 (PST) MIME-Version: 1.0 References: <202505181556.3n6oiowvntyr@alvherre.pgsql> In-Reply-To: From: Hannu Krosing Date: Fri, 28 Nov 2025 20:08:12 +0100 X-Gm-Features: AWmQ_bl5rA_2gfbQUbxbIi1S2JosUCo82zzlMy075_hbSiJ87sjbTzsh_kHSeeU Message-ID: Subject: Re: Revisiting {CREATE INDEX, REINDEX} CONCURRENTLY improvements To: Matthias van de Meent Cc: Mihail Nikalayeu , Sergey Sargsyan , =?UTF-8?Q?=C3=81lvaro_Herrera?= , Andres Freund , Michael Paquier , PostgreSQL Hackers , Andrey Borodin , Melanie Plageman Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk On Fri, Nov 28, 2025 at 7:31=E2=80=AFPM Matthias van de Meent wrote: > > On Fri, 28 Nov 2025 at 18:58, Hannu Krosing wrote: > > > > On Fri, Nov 28, 2025 at 5:58=E2=80=AFPM Matthias van de Meent > > wrote: > > > > > ... > > > I'm a bit worried, though, that LR may lose updates due to commit > > > order differences between WAL and PGPROC. I don't know how that's > > > handled in logical decoding, and can't find much literature about it > > > in the repo either. > > > > Now the reference to logical decoding made me think that maybe to real > > fix for CIC would be to leverage logical decoding for the 2nd pass of > > CIC and not wore about in-page visibilities at all. > > -1: Requiring the logical decoding system just to reindex an index > without O(tablesize) lock time adds too much overhead, and removes > features we currently have (CIC on unlogged tables). wal_level=3Dlogical > *must not* be required for these tasks if we can at all avoid it. > I'm also not sure whether logical decoding gets access to the HOT > information of the updated tuples involved, and therefore whether the > index build can determine whether it must or can't insert the tuple. There are more and more cases (not just CIC here) where using logical decoding would be the most efficient solution, so why not instead start improving it instead of complicating the system in various places? We could even start selectively logging UNLOGGED and TEMP tables when we start CIC if CIC has enough upsides. > I don't think logical decoding is sufficient, because we don't know > which tuples were already inserted into the index by their own > backends, so we don't know which tuples' index entries we must skip. The premise of pass2 in CIC is that we collect all the rows that were inserted after CIC started for which we are not 100% sure that they are inserted in the index. We can only be sure they are inserted for transactions started after pass1 completed and the index became visible and available for inserts. I am sure that it is possible to avoid inserting duplicate entry (same value and tid) at insert time. And we do not care about hot update chains dusing normal CREATE INDEX or first pass of CIC - we just index what is visible NOW wit no regard of weather the tuple is at the end of HOT update chain. > Kind regards, > > Matthias van de Meent. > > PS. I think the same should be true for REPACK CONCURRENTLY, but > that's a new command with yet-to-be-determined semantics, unlike CIC > which has been part of PG for 6 years. CIC has been around way longer, since 8.2 released in 2006, so more like 20 years :) --- Cheers Hannu