Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1vVd4a-005jlW-3C for pgsql-hackers@arkaria.postgresql.org; Tue, 16 Dec 2025 21:59:13 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.96) (envelope-from ) id 1vVd4Z-008maA-1n for pgsql-hackers@arkaria.postgresql.org; Tue, 16 Dec 2025 21:59:12 +0000 Received: from makus.postgresql.org ([2001:4800:3e1:1::229]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1vVd4Z-008mZy-0N for pgsql-hackers@lists.postgresql.org; Tue, 16 Dec 2025 21:59:12 +0000 Received: from mail-lf1-x131.google.com ([2a00:1450:4864:20::131]) by makus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.96) (envelope-from ) id 1vVd4Y-0012VM-1c for pgsql-hackers@postgresql.org; Tue, 16 Dec 2025 21:59:11 +0000 Received: by mail-lf1-x131.google.com with SMTP id 2adb3069b0e04-597c83bb5c2so4488430e87.3 for ; Tue, 16 Dec 2025 13:59:10 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1765922348; x=1766527148; darn=postgresql.org; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=bHRvdEQIYzGcw60huv9p+37i+6TFfplLqz+08v15Nec=; b=bfUyoJv/yBDf0Y+yNTLgtebC1Z5Vhzf8jizyFLhdKl1dmWRmmPNVo+SeaTHk+cjgZW hnS+/skDRkIr1tylu5L87ErZbxrNfDRlGJQyheVol26je8VqMboTZRl3Lnzt16mBYiw8 HCaZsq/FRVoURYswurrv0qA5PTWoT0XZCyl4S+xn6NUPtR2jNSpjgaLD7/8OkRmSOe8/ 9JS0MPkDfWdZ7QvSDXXFoNkWSbLUIkplt3shNrYReq4C9qjObFhweiZypvu8ESJJYGWo yURZAR1Iv5UAWNpHj7Z4DySH8bwMA5mXx5jgJx7ZMc9A5twDHUT44Usoyv+09ZOrHedD BsqA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1765922348; x=1766527148; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-gg:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=bHRvdEQIYzGcw60huv9p+37i+6TFfplLqz+08v15Nec=; b=WJjgKqhQ3ViHoLXVmzjjWef4rra3nzJ27ze7PiWcYGxsYsAUOtERu3hz9l50X2T1HG FwgKKB5jD+/nKr6IlRZW8PIHbKc0xd53V2s66LuqPnqutfEKJM2Eu9N9DRqJ/kX48xdl N6gcBxKkiFZHM0zaRVb1tiVUIAmzJ8vtpG8X4iWLNw3jp6t/5PM805rFPvEl9fDI9ss9 7D2Zpc92IpRABOBwDdCWjOFf/Btd5IErbr3Nrc3Jd+fSzk3Unr4kIEhEY7iGXG1RNYNp DJPuqIjtVCajrmPfUpcduChbCXiavaZ9gC8CCUjiAvWAc6X7J7Yzwz4cxouqTIQSEGsn hHOg== X-Forwarded-Encrypted: i=1; AJvYcCUlse+6YAmG1FAdOJLalQ7fipirV3ACz2pxwe6YPT+WePcyZ4GEJsvhj0b+MxshkYq3K1jg4X3deeVXQpzn@postgresql.org X-Gm-Message-State: AOJu0Yxoo7IhoNRNZmlKp0z5R/32M3PhbKULWfc/Bsu1fGXw9NAUpMn9 mTT2LlPGCqlFL638jGI0bbHX/ZZuQIW7jvJwWrbT7MN3pjRlftOM0CjuxNrNgNNyMt/pvY0Z8wI zI3fF3m8J6P0rvVV3JO1WQ95TTPWvFuM= X-Gm-Gg: AY/fxX51SOdg8kAaBpMwgVgkwOf7G5eDV7rmesPxporOZtBZEign5rEnY6OaSD5Hzc5 XevrNa44PDTxo8eyz1POpEidTF5TWE5axEXIrwC4RGx9EmO3eOR3U4KHlHm8WeaPyyPn2OmpmsZ lXNIoAATxRyi+fxDX3cgady9rB4kUA3eqT55+rBLrivKXgaJnnBQwBvvVI5UDgxAbV3ZfSer1o+ fZNcxAp+Vk2bmWZDu9/jijjZpJDhYUivCMdusUvy6FHcbLKrMaFhJYkivlgLPw5gBHNnCo7BJda z4mTNsD44A1Y4+r/UEv5wLm1/qMqiA== X-Google-Smtp-Source: AGHT+IE+Hcot+T/bOIgrp1kO6x4APvzFRY7Ji74hByH73lnBpkGo6ai+ityNiUtv7jBCgHuO3Ar16JbMeATU47FeU70= X-Received: by 2002:a05:6512:23a2:b0:577:6e42:3718 with SMTP id 2adb3069b0e04-598faa0176amr5612711e87.7.1765922348195; Tue, 16 Dec 2025 13:59:08 -0800 (PST) MIME-Version: 1.0 References: <5784.1764580169@localhost> <5293.1764837294@localhost> <120851.1764875744@localhost> <4b368c17-a614-401b-a335-398450249c47@iki.fi> In-Reply-To: <4b368c17-a614-401b-a335-398450249c47@iki.fi> From: Mihail Nikalayeu Date: Tue, 16 Dec 2025 22:58:00 +0100 X-Gm-Features: AQt7F2ouWRGp82A_38ZdRmd5zoU8MsOyPiUCw0sbYNqhRZ9itwg3-JBDVfpRLnc Message-ID: Subject: Re: Revisiting {CREATE INDEX, REINDEX} CONCURRENTLY improvements To: Heikki Linnakangas Cc: Hannu Krosing , PostgreSQL Hackers , Antonin Houska , Matthias van de Meent , Sergey Sargsyan , alvherre@kurilemu.de, Andres Freund , Michael Paquier , Andrey Borodin , Melanie Plageman Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk Hello, Heikki! On Tue, Dec 16, 2025 at 2:43=E2=80=AFPM Heikki Linnakangas wrote: > Firstly, I think the STIR approach is the right approach at the high > level. I don't like the logical decoding idea, for the reasons Matthias > and Mikhail already mentioned. Maybe there's some synergy with REPACK, > but it feels different enough that I doubt it. Let's focus on the STIR > approach. Thanks for checking that thread. > In the first transaction that inserts the catalog entry with > indisready=3Dfalse, also create a shmem struct. In that struct, we can > store information about what state the build is in, and whether > insertions should go to the STIR or to the real index. Yes, it might look simpler, but from other point of view: * we need to check that shmem for each index insert (whenever we build something or not) * or we need to put something into an index list with information "write instead of that index into that shmem" * currently we have some proven mechanics related to transactions, catalog snapshots, relcache, invalidation etc. Some tricky synchronization may be required here (to avoid any drift of way transaction see shmem and relcache). > As one small incremental improvement, we could use the shmem struct to > avoid one of the "wait for all transactions" steps in the current > implementation. In validate_index(), after we mark the index as > 'indisready' we have to wait for all transactions to finish, to ensure > that all subsequent insertions have seen the indisready=3Dtrue change. We > could avoid that by setting a flag in the shmem struct instead, so that > all backends would see instantly that the flag is flipped. That may be tricky. If I set a flag - what if someone checked it 1ns ago and decided it is not required to write something in the index? How to ensure that now everyone really knows about it without heavy locking? In all current maintenance operations we ensure in some way (by locking\unlocking a relation or waiting for transactions) everyone has fresh enough relcache. Don't think we should involve anything special for the CIC scenario here. But some universal solution (like ensuring that every other transaction that had an outdated relcache is ended) may benefit all related scenarios. > Improved STIR approach > > Here's another proposal using the STIR approach. It's a little different > from the patches so far: > .... > 7. Retail insert all the tuples from the STIR to the index. Hm, that clever idea... At the same time my tests show what index scan is light compared to heap scans (especially second one - it is not paralleled). > Snapshot refreshing > ------------------- > - In step 4, while we are building the index, we can periodically get a > new snapshot, update the cutoff in the shmem struct, and drain the STIR > of the tuples that are already in it. But together with snapshot resetting such an approach is still more effective (in terms of index scan) but feels much more complex, including some complex locking. Need to think a little bit here. Best regards, Mikhail.