Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1sGAl3-00A2f8-GY for pgsql-general@arkaria.postgresql.org; Sun, 09 Jun 2024 05:06:22 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.94.2) (envelope-from ) id 1sGAl1-002jWl-Fh for pgsql-general@arkaria.postgresql.org; Sun, 09 Jun 2024 05:06:20 +0000 Received: from magus.postgresql.org ([2a02:c0:301:0:ffff::29]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1sGAl1-002jWc-40 for pgsql-general@lists.postgresql.org; Sun, 09 Jun 2024 05:06:20 +0000 Received: from mail-vk1-xa32.google.com ([2607:f8b0:4864:20::a32]) by magus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.94.2) (envelope-from ) id 1sGAkz-000iub-LF for pgsql-general@lists.postgresql.org; Sun, 09 Jun 2024 05:06:19 +0000 Received: by mail-vk1-xa32.google.com with SMTP id 71dfb90a1353d-4eb007a9f6fso1014414e0c.2 for ; Sat, 08 Jun 2024 22:06:17 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1717909576; x=1718514376; darn=lists.postgresql.org; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:from:to:cc:subject:date:message-id:reply-to; bh=bizvJbuhqobYeOv2roppSVKugMlO3OlMY2VPM5YXWdI=; b=IX9cXSW2zqLkIhV6b3sinHAWc0bZJcC/iRfCCUnTROXeCTO6hBopSvIbwcea1TvEV7 aGYG5icgc0IvJIsxbfRvP5Dd+DRwmfgNl9D9fgSgf4B6grzz4Hx7szOfebif2yC218gc UZAtyuJ5ouVGExA4ytSG7vpIx3HN6+TNeVvhb4gAZBJBymZ7IYnbFRL2itlZOFICNB2K 2Sj2esYuUOuZdMJcQ7P/fQF4loiuU14BDuOklw805Ch5Gg4ZemYwK4fUqSCDfQEGVf+h kvuXXrNeNGwTeUuQs2GINhBDx2Uv6H4akzhFlJ96iokCQRqIsI/ih2g5z/atzpz+M4bn HCrw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1717909576; x=1718514376; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=bizvJbuhqobYeOv2roppSVKugMlO3OlMY2VPM5YXWdI=; b=lJ5GN2BMYtac4ccyNPUHR0Vrpb4STSSQPGjMEg/9Otl78vrWpgNE/FF9Myv4WDASg7 BM05nBB4VU+BxWkPrQVkxWIlszVkDUuFr6ZkFJZngnWC/vGMneFbV/dm6AUUtc8eMmIU bSrqUWMrFqKzrjxvwJupal0tJUVs8DEG9OzoOSH8zG+SJ2ROs+6XfYYkbiE+9GU8SWhX ukQlSk9O2ehvjFYxQV9saqtaWJp4iJ+ebN4c0tD3uCtDCc2zU6MM8xg3hyYkiV9moyki MkXgwqh+zL+dgMYVb/vsLl0TPI8doTeNRGXNMgsX2kel8IdL/WhzSncl65jxYBv0/jz1 9QPw== X-Gm-Message-State: AOJu0Yw/4oq3rPoGHhTT8hWPeoTpKYURwRUYzD1tWb8KOhxg4/7uVCZR YS0eYeSDtvUJ5VNTl9sleIP2gCVSryBlPMmhP0skIuNt4rJSkFiVA8gHmXJAaLh8Sft+Ud4ZeJ7 00yeF7sLsbEX5QP4Psrl9Ud9+7lA= X-Google-Smtp-Source: AGHT+IFRtOKr2cLNncmBJZoLuLuq95AO3VVTiASlkkCGg9JTXZ7SuDRVMgZbIle/E+KIPKN5wlyZ/AOkE9uuE4yKcqI= X-Received: by 2002:ac5:c5ab:0:b0:4d8:4a7f:c166 with SMTP id 71dfb90a1353d-4eb562b291dmr5621811e0c.12.1717909575919; Sat, 08 Jun 2024 22:06:15 -0700 (PDT) MIME-Version: 1.0 References: In-Reply-To: From: sud Date: Sun, 9 Jun 2024 10:36:02 +0530 Message-ID: Subject: Re: Creating big indexes To: Lok P Cc: pgsql-general Content-Type: multipart/alternative; boundary="000000000000076e1c061a6dfe88" List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk --000000000000076e1c061a6dfe88 Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable On Sat, Jun 8, 2024 at 12:53=E2=80=AFPM Lok P wrote: > Hello, > We have a few tables having size ~5TB and are partitioned on a timestamp > column. They have ~90 partitions in them and are storing 90 days of data. > We want to create a couple of indexes on those tables. They are getting t= he > incoming transactions(mainly inserts) 24/7 , which are mostly happening o= n > the current day/live partition. Its RDS postgres version 15.4. So in this > situation > > Should we go with below i.e one time create index command on the table.. > > CREATE INDEX CONCURRENTLY idx1 ON tab(column_name); > Or > create index on individual partitions from different sessions, say for > example create indexes on 30 partitions each from three different session= s > so as to finish all the 90 partitions faster? > CREATE INDEX CONCURRENTLY idx1 ON tab_part1(column_name); > CREATE INDEX CONCURRENTLY idx1 ON tab_part2(column_name); > ..... > ..... > > Basically I have three questions: > 1)If we can do this index creation activity online without impacting the > incoming transactions or do we have to take down time for this activity? > 2)If we can't do it online then , what is the fastest method to do this > index creation activity ? > 3)Should we change the DB parameters in a certain way to make the process > faster? We have currently set below parameters > > max_parallel_workers-16 > max_parallel_maintenance_workers-2 > maintenance_work_mem- 4GB > > > You can first create the index on the table using the "On ONLY"keyword, something as below. CREATE INDEX idx ON ONLY tab(col1); Then create indexes on each partition in "concurrently" from multiple sessions in chunks. CREATE INDEX CONCURRENTLY idx_1 ON tab_part1(col1); CREATE INDEX CONCURRENTLY idx_2 ON tab_part2(col1); After this step finishes the table level index which was created in the first step will be in valid state automatically. --000000000000076e1c061a6dfe88 Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable

On Sat, Jun 8, 2024 at 12:53=E2=80=AFPM L= ok P <loknath.73@gmail.com&g= t; wrote:
Hello,
We have a few tables having size ~5TB and are partiti= oned on a timestamp column. They have ~90 partitions in them and are storin= g 90 days of data. We want to create a couple of indexes on those tables. T= hey are getting the incoming transactions(mainly inserts) 24/7 , which are = mostly happening on the current day/live partition. Its RDS postgres versio= n 15.4. So in this situation

Should we go with below i.e = one time create index command on the table..

CREATE INDEX CONCURRENT= LY idx1 ON tab(column_name);
Or
create index on individual partition= s from different sessions, say for example create indexes on 30 partitions = each from three different sessions so as to finish all the 90 partitions fa= ster?
CREATE INDEX CONCURRENTLY idx1 ON tab_part1(column_name);
CREAT= E INDEX CONCURRENTLY idx1 ON tab_part2(column_name);
.....
<= div>.....

Basically I have three questions:
1)If we can do this i= ndex creation activity online without impacting the incoming transactions o= r do we have to take down time for this activity?
2)If we can't do = it online then , what is the fastest method to do this index creation activ= ity ?
3)Should we change the DB parameters in a certain way to make= the process faster? We have currently set below parameters

max_para= llel_workers-16
max_parallel_maintenance_workers-2
maintenance_work_m= em- 4GB



=C2=A0You can first create the index on the table using th= e "On ONLY"keyword, something as below.

= CREATE INDEX idx ON ONLY tab(col1);

Then creat= e indexes on each partition in "concurrently" from multiple sessi= ons in chunks.

CREATE INDEX CONCURRENTLY idx_1=C2= =A0 =C2=A0 ON tab_part1(col1);
CREATE INDEX CONCURRENTLY idx_2=C2=A0 =C2= =A0 ON tab_part2(col1);

After this step finish= es the table level index which was created in the first step will be in val= id state automatically.

--000000000000076e1c061a6dfe88--