Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1vP5aj-003pco-2Q for pgsql-hackers@arkaria.postgresql.org; Fri, 28 Nov 2025 21:01:21 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.96) (envelope-from ) id 1vP5ai-00DWxB-0w for pgsql-hackers@arkaria.postgresql.org; Fri, 28 Nov 2025 21:01:20 +0000 Received: from makus.postgresql.org ([2001:4800:3e1:1::229]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1vP5ah-00DWx3-2i for pgsql-hackers@lists.postgresql.org; Fri, 28 Nov 2025 21:01:20 +0000 Received: from mail-qt1-x82b.google.com ([2607:f8b0:4864:20::82b]) by makus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.96) (envelope-from ) id 1vP5ae-0020QI-34 for pgsql-hackers@postgresql.org; Fri, 28 Nov 2025 21:01:19 +0000 Received: by mail-qt1-x82b.google.com with SMTP id d75a77b69052e-4ee243b98caso586401cf.1 for ; Fri, 28 Nov 2025 13:01:17 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1764363677; x=1764968477; darn=postgresql.org; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=3I3qQ7sMNb95ChhtKCntwVPv8B6BKNd8+Xiog7bdn60=; b=yjbZpZyjl4Nw8dj7ZAHUWAqEX31rjBdI/tXEUw9ccxVx4pmbUg3/H9IW1tUd0NH4Fv k3xRZvcX0Rybuoaaq5Im8b3o7tCJbbh8bpcayHOjIQyb4g3sv2NVrAY327VJWpYhly3R anxOz+9H6Fafq9dQMBco9RSDpqfF37zEodyUl4zxzeekBwDxXTRzJUe0GLv5ZTFwc2Mk 6lSWXdGdr8K1obasLvblpVaukP3T4m5i79gqqcx+IUvupdvr2xbDUUGiBlW+tEq5nu+2 8CfysADMC4XPEWEnf3RWiE6MQ5yhO+AWtwtEFUgcmSVJ+SEkqqhjOhS7nraOX97wcBPs sieQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1764363677; x=1764968477; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-gg:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=3I3qQ7sMNb95ChhtKCntwVPv8B6BKNd8+Xiog7bdn60=; b=LvVXmWOz6h2G43OIrRG3lA6AraLs7aCKtVSd3PXvBfxb0zrtuJN+3xHejQpe83sEx2 vpBaiWsrnTO4I+nN7I+sAJTKlPkZDOrvFR8VWj+oH3kaEz0lbfduXQpw/0/JIFaPAtzw 7l3zaUO2Pq9pNZf8k91L+OQFw9L6jvDCloiUICuA6ihm+KDxkAfZketzJCi8HWXCbKlx G6DgyiLeL8l+8CNB1QGWiupRTluAvaZP9+gI71AFhK5zsXzQcZ1jwryc1FQvWKbEKLBU 9NNn66SezHgKSos6XLM4a0a3acnf/3ofUVUBi+qOFGuolG8MUbA2665HzBZD4Hd+aYe4 zvDw== X-Forwarded-Encrypted: i=1; AJvYcCVbwZNavliefIW2yaRol8CxgNKnTm+eOpLuSFYUffa2rR7WctYNMWjeN6Zm+8lL3jsikxbhEgxIIFyGeJB1@postgresql.org X-Gm-Message-State: AOJu0YxEBp2NpXWDjnM/zfXbUX31RajywoxYjgLj9VRobNAsWtTduGJK q2UQyy0S/RxEtw9VnWIesVWZdoK3QlavN/LBJXaBQkVhXy87fjOTGd+Nnn/fYur8JPgwkBDAfM8 OO72kCCXRPzNXSy9cGBVN19uQXQAKWmNPrWdE+crP X-Gm-Gg: ASbGncsd9nf4s1rkXN6zHzraa/kBwQx09REEh7sj0MOpV0OKUulQaIwfle2m65/CbrR Yb80lYiLKWKLNLu3pZZXWukjgUcHD9TqFIyZ7KsV0IjQaLKXp79Yxjl9wAMATOfjZEF524mvOjH gntUlwMSXUt4KoZue6Mo41q3OrawLhG5pyPPuEaAKJrsqqoA4R1vrOWk2W1Bb1O5QC0TXOud4lK aBOjEd/N/PUeJYyS4xENvw+Jk0aC3sXUxsvBA9CN4j7i08p6TV1jZnl3iWFQIWPlCoKjgMeFmN9 bQNSavwfNrnEk8HGkKKjH5Wk75oS X-Google-Smtp-Source: AGHT+IFX3Zt1wTcJ21UBcp4MPOl7dh8/QrkDTVZ3ZxMosJHljrz9Si2D9aS7VUrcQYG/2RIMnOTjC1FRlEefwKlAxwg= X-Received: by 2002:a05:622a:346:b0:4ed:b665:3779 with SMTP id d75a77b69052e-4efd0cadedemr9823301cf.16.1764363676152; Fri, 28 Nov 2025 13:01:16 -0800 (PST) MIME-Version: 1.0 References: <202505181556.3n6oiowvntyr@alvherre.pgsql> In-Reply-To: From: Hannu Krosing Date: Fri, 28 Nov 2025 22:01:04 +0100 X-Gm-Features: AWmQ_bkM3jFLwFFdD5D4BeW-tPS36__qnpG0wLgwV8_YA0JfZ5hMba8GD_gRQZM Message-ID: Subject: Re: Revisiting {CREATE INDEX, REINDEX} CONCURRENTLY improvements To: Mihail Nikalayeu Cc: Matthias van de Meent , Sergey Sargsyan , =?UTF-8?Q?=C3=81lvaro_Herrera?= , Andres Freund , Michael Paquier , PostgreSQL Hackers , Andrey Borodin , Melanie Plageman , Antonin Houska Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk On Fri, Nov 28, 2025 at 9:42=E2=80=AFPM Mihail Nikalayeu wrote: > > Hi, Hannu! > > I think you pressed "Reply" instead of "Reply All" - so, I put it to > the list (looks like nothing is secret here). > Mostly it is because of my opinion at the end of the mail which I want > to share with the list. Thanks, and yes, it was meant for the list. > On Fri, Nov 28, 2025 at 8:33=E2=80=AFPM Hannu Krosing = wrote: > > If it is an *index AM* then this may not solve HOT chains issue (see > > below), if we put it on top of *table AM* as some kind of pass-through > > collector then likely yes, though you may still want to do final sort > > in commit order to know which one is the latest version of updated > > tuples which needs to go in the index. The latter is not strictly > > needed, but would be a nice optimisation for oft-updated rows. > > It is AM which is added as an index (with the same > columns/expressions/predicates) to the table before phase 1 starts. > So, all new tuples are inserted into it. > > > And I would not collect just TID, but also the indexes value, as else > > we end up accessing the table in some random order for getting the > > value (and possibly do visibility checks) > Just TIDs - it is ordered at validation phase (while merging with an > main index) and read using AIO - pretty fast. It is a space vs work compromise - you either collect it at once or have to read it again later. Even pretty fast is still slower than doing nothing :) > > I am not sure where we decide that tuple is HOT-updatable, but I > > suspect that it is before we call any index AMs, so STIR ios not > > guaranteed to solve the issues with HOT chains. > > I am not sure what the HOT-chains issue is, but it actually works > correctly already, including stress tests. > It is even merged into one commercial fork of PG (I am not affiliated > with it in any way). It was about a simplistic approach for VACUUM to just ignore the CIC backends and then missing some inserts. > > (And yes, I have a patch in works to include old and new tids> as part > > of logical decoding - they are "almost there", just not passed through > > - which would help here too to easily keep just the last value) > > Yes, at least it is required for the REPACK case. > > But.... > > Antonin already has a prototype of patch to enable logical decoding > for all kinds of tables in [0] (done in scope of REPACK). > > So, if we have such mechanics in place, it looks nice (and almost the > same) for both CIC and REPACK: > * in both cases we create temporary slot to collect incoming tuples > * in both cases scan the table resetting snapshot every few pages to > keep xmin horizon propagate > * in both cases the process already collected part every few megabytes > * just the logic of using collected tuples is different... > > So, yes, from terms of effectiveness STIR seems to be better, but such > a common approach like LD looks tempting to have for both REPACK/CIC. My reasoning was mainly that using something that already exists, and must work correctly in any case, is a better long-term strategy than adding complexity in multiple places. After looking up when CIC appeared (v 8.2) and when logical decoding came along (v9.4) I start to think that CIC probably would have used LD if it had been available when CIC was added. > On Fri, Nov 28, 2025 at 5:58=E2=80=AFPM Matthias van de Meent > wrote: > > -1: Requiring the logical decoding system just to reindex an index > > without O(tablesize) lock time adds too much overhead, > > How big is the additional cost of maintaining logical decoding for a > table? Could you please evolve a little bit? > > Best regards, > Mikhail. > > > [0]: https://www.postgresql.org/message-id/152010.1751307725%40localhost > (v15-0007-Enable-logical-decoding-transiently-only-for-REPACK-.patch)