Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1uRCWs-008TCp-3X for pgsql-hackers@arkaria.postgresql.org; Mon, 16 Jun 2025 16:17:50 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.94.2) (envelope-from ) id 1uRCWo-001OjZ-Pe for pgsql-hackers@arkaria.postgresql.org; Mon, 16 Jun 2025 16:17:47 +0000 Received: from makus.postgresql.org ([2001:4800:3e1:1::229]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1uRCWo-001OjR-FK for pgsql-hackers@lists.postgresql.org; Mon, 16 Jun 2025 16:17:47 +0000 Received: from mail-qk1-x733.google.com ([2607:f8b0:4864:20::733]) by makus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.96) (envelope-from ) id 1uRCWn-002LIe-0S for pgsql-hackers@postgresql.org; Mon, 16 Jun 2025 16:17:46 +0000 Received: by mail-qk1-x733.google.com with SMTP id af79cd13be357-7d38d1eae03so454878185a.2 for ; Mon, 16 Jun 2025 09:17:44 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1750090664; x=1750695464; darn=postgresql.org; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:from:to:cc:subject:date:message-id:reply-to; bh=ZIPop46fK4Y7DkKUXACEUhrSsejk9EDkM9PKrNJSQhQ=; b=W28lI8n87s3oR2zOD3yzFUI1c9GB/l0zOsBX1Dxg18KbeZ9cVSD+dVCj0IaXtmrEyh 3WKgIkKamCigTChMFOuCaw6hEN6+M1YMcLBSiajwDnoTXRdHAmoUn7Vy0bk3YAf0//f9 NatibrKX6lUqaWTBIUyixg1reK30z8JRSvUhIR0NcvaA7MIudbhHCJn/rMffC8QpSbxV W/MFdeUv5+j8YMHEJLzC4A5zmyZd+0jPNHAqnB2xtB/XFTErI7ubXvtwpmuAL0vhSxrR RtW5nFHwBXUvIepg54F0SmUmb0bNxlH2dvZrerSCLBwpWZ69sNYVJmBAc4TNgMt3aoH9 Oyfg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1750090664; x=1750695464; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=ZIPop46fK4Y7DkKUXACEUhrSsejk9EDkM9PKrNJSQhQ=; b=s7wVSBCDo+XNEo6J/TxW/IEBAlcaPhqAfqm50WJIuHss9Q4+4e7eJwZViYZbaKP+OB q+eLJk5yEPvZ/0jVS6BgB4MrwOC5ANgomCUM5Dyzr2daCGpSAaVDsq35LQdsJiWfiP3q MN3OcqpVQJfwjz2YCWktCRRZRsFLbyEcGUBJ/10BelGWCZUxNJp5HMnCpUHiotGK11i9 MVFo+XYELo7rgRnCM55yjHL6vY95B3aEA2s3fR8BSrHWvt54qD4aQOkgo0j3MF9EPLmd s6/zrEAjzrnEcGmky92z7ZeI5ZZqcx+RkMcw5RYIzB0TM/cjl4fHr2g7vlYfTjUSFY+U X04Q== X-Forwarded-Encrypted: i=1; AJvYcCUHcVRCtSatI7D1dNCfNSbONhal2gBPfoC3F//H1zM1unxNJPKTVYcSOzx79csZM8tRcRX0Eo3hE/yM6uZ3@postgresql.org X-Gm-Message-State: AOJu0Yx/zBpg/WZJ8fL1rQtMwY7SHeDyVJUFDJwPgiaZ4uci+Qc4FRg2 ePOdES14VMTSoK18rsE3kwzyE5QsID0gADOjLI6NxfajMokR036ZqPi4fMuca1kSrjKgcZ72SZe q4hhc9ZeB9vkWWQ7r/k1B8hvRMRBaZTvO5/bS X-Gm-Gg: ASbGncscR6y1lL8+JABxOIdxBON9YOBuwTX/fPFT4xX3JSalL4qBToWDfHEj3MY0dsA GrmQ+bXtIjmu1XvRU1QS8dQ3vbeAiuK54NNF3slPsOGHihbYT8/NoWrjPUHccPfMeZ2ipEGM4bn JmVWnClOR/f0QJBy9b+RHGHV87MyB0+1XHchHc9aUaRHcZhw== X-Google-Smtp-Source: AGHT+IF9peQr7SDUuNkJQSQLqQS2Y53z2dJMyufgqGMPnD/ZaPrcWoExswB6NfEeIigmUPpEycA2tKT3Lh9+V2e3zgk= X-Received: by 2002:a05:620a:240d:b0:7d3:8fa7:8885 with SMTP id af79cd13be357-7d3c6c1f5b8mr1586146285a.22.1750090664070; Mon, 16 Jun 2025 09:17:44 -0700 (PDT) MIME-Version: 1.0 References: <202505181556.3n6oiowvntyr@alvherre.pgsql> In-Reply-To: From: Sergey Sargsyan Date: Mon, 16 Jun 2025 19:17:33 +0300 X-Gm-Features: AX0GCFssylv4xcceMpPDCAqGhitPoZi04g4tnpZ_F9a4-Ktt4SqVAFNRrZnG_zI Message-ID: Subject: Re: Revisiting {CREATE INDEX, REINDEX} CONCURRENTLY improvements To: Mihail Nikalayeu Cc: =?UTF-8?Q?=C3=81lvaro_Herrera?= , Andres Freund , Michael Paquier , PostgreSQL Hackers , Andrey Borodin , Melanie Plageman , Matthias van de Meent Content-Type: multipart/alternative; boundary="0000000000005b512c0637b2bc78" List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk --0000000000005b512c0637b2bc78 Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable Hey Mihail, I've started looking at the patches today, mostly the STIR part. Seems solid, but I've got a question about validation. Why are we still grabbing tids from the main index and sorting them? I think it's to avoid duplicate errors when adding tuples from STIP to the main index, but couldn't we just suppress that error during validation and skip the new tuple insertion if it already exists? The main index may get huge after building, and iterating over it in a single thread and then sorting tids can be time consuming. At least I guess one can skip it when STIP is empty. But, I think we could skip it altogether by figuring out what to do with duplicates, making concurrent and non-concurrent index creation almost identical in speed (only locking and atomicity would differ). p.s. I noticed that `stip.c` has a lot of functions that don't follow the Postgres coding style of return type on separate line. On Mon, Jun 16, 2025, 6:41=E2=80=AFPM Mihail Nikalayeu wrote: > Hello, everyone! > > Rebased, patch structure and comments available here [0]. Quick > introduction poster - here [1]. > > Best regards, > Mikhail. > > [0]: > https://www.postgresql.org/message-id/flat/CADzfLwVOcZ9mg8gOG%2BKXWurt%3D= MHRcqNv3XSECYoXyM3ENrxyfQ%40mail.gmail.com#52c97e004b8f628473124c05e3bf2da1 > [1]: > https://www.postgresql.org/message-id/attachment/176651/STIR-poster.pdf > --0000000000005b512c0637b2bc78 Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable
Hey Mihail,

= I've started looking at the patches today, mostly the STIR part. Seems = solid, but I've got a question about validation. Why are we still grabb= ing tids from the main index and sorting them?

<= /div>
I think it's to avoid duplicate errors when addi= ng tuples from STIP to the main index, but couldn't we just suppress th= at error during validation and skip the new tuple insertion if it already e= xists?

The main index ma= y get huge after building, and iterating over it in a single thread and the= n sorting tids can be time consuming.

At least I guess one can skip it when STIP is empty. But, I t= hink we could skip it altogether by figuring out what to do with duplicates= , making concurrent and non-concurrent index creation almost identical in s= peed (only locking and atomicity would differ).

=
p.s. I noticed that `stip.c` has a lot of functions= that don't follow the Postgres coding style of return type on separate= line.

On Mon, Jun 16, 2025, 6:41=E2=80=AFPM Mih= ail Nikalayeu <mihailnikala= yeu@gmail.com> wrote:
Hello,= everyone!

Rebased, patch structure and comments available here [0]. Quick
introduction poster - here [1].

Best regards,
Mikhail.

[0]: https://www= .postgresql.org/message-id/flat/CADzfLwVOcZ9mg8gOG%2BKXWurt%3DMHRcqNv3XSECY= oXyM3ENrxyfQ%40mail.gmail.com#52c97e004b8f628473124c05e3bf2da1
[1]: https://www.p= ostgresql.org/message-id/attachment/176651/STIR-poster.pdf
--0000000000005b512c0637b2bc78--