public inbox for [email protected]help / color / mirror / Atom feed
Creating big indexes 3+ messages / 2 participants [nested] [flat]
* Creating big indexes @ 2024-06-08 07:23 Lok P <[email protected]> 2024-06-09 05:06 ` Re: Creating big indexes sud <[email protected]> 0 siblings, 1 reply; 3+ messages in thread From: Lok P @ 2024-06-08 07:23 UTC (permalink / raw) To: pgsql-general <[email protected]> Hello, We have a few tables having size ~5TB and are partitioned on a timestamp column. They have ~90 partitions in them and are storing 90 days of data. We want to create a couple of indexes on those tables. They are getting the incoming transactions(mainly inserts) 24/7 , which are mostly happening on the current day/live partition. Its RDS postgres version 15.4. So in this situation Should we go with below i.e one time create index command on the table.. CREATE INDEX CONCURRENTLY idx1 ON tab(column_name); Or create index on individual partitions from different sessions, say for example create indexes on 30 partitions each from three different sessions so as to finish all the 90 partitions faster? CREATE INDEX CONCURRENTLY idx1 ON tab_part1(column_name); CREATE INDEX CONCURRENTLY idx1 ON tab_part2(column_name); ..... ..... Basically I have three questions: 1)If we can do this index creation activity online without impacting the incoming transactions or do we have to take down time for this activity? 2)If we can't do it online then , what is the fastest method to do this index creation activity ? 3)Should we change the DB parameters in a certain way to make the process faster? We have currently set below parameters max_parallel_workers-16 max_parallel_maintenance_workers-2 maintenance_work_mem- 4GB Regards Lok ^ permalink raw reply [nested|flat] 3+ messages in thread
* Re: Creating big indexes 2024-06-08 07:23 Creating big indexes Lok P <[email protected]> @ 2024-06-09 05:06 ` sud <[email protected]> 2024-06-09 05:09 ` Re: Creating big indexes Lok P <[email protected]> 0 siblings, 1 reply; 3+ messages in thread From: sud @ 2024-06-09 05:06 UTC (permalink / raw) To: Lok P <[email protected]>; +Cc: pgsql-general <[email protected]> On Sat, Jun 8, 2024 at 12:53 PM Lok P <[email protected]> wrote: > Hello, > We have a few tables having size ~5TB and are partitioned on a timestamp > column. They have ~90 partitions in them and are storing 90 days of data. > We want to create a couple of indexes on those tables. They are getting the > incoming transactions(mainly inserts) 24/7 , which are mostly happening on > the current day/live partition. Its RDS postgres version 15.4. So in this > situation > > Should we go with below i.e one time create index command on the table.. > > CREATE INDEX CONCURRENTLY idx1 ON tab(column_name); > Or > create index on individual partitions from different sessions, say for > example create indexes on 30 partitions each from three different sessions > so as to finish all the 90 partitions faster? > CREATE INDEX CONCURRENTLY idx1 ON tab_part1(column_name); > CREATE INDEX CONCURRENTLY idx1 ON tab_part2(column_name); > ..... > ..... > > Basically I have three questions: > 1)If we can do this index creation activity online without impacting the > incoming transactions or do we have to take down time for this activity? > 2)If we can't do it online then , what is the fastest method to do this > index creation activity ? > 3)Should we change the DB parameters in a certain way to make the process > faster? We have currently set below parameters > > max_parallel_workers-16 > max_parallel_maintenance_workers-2 > maintenance_work_mem- 4GB > > > You can first create the index on the table using the "On ONLY"keyword, something as below. CREATE INDEX idx ON ONLY tab(col1); Then create indexes on each partition in "concurrently" from multiple sessions in chunks. CREATE INDEX CONCURRENTLY idx_1 ON tab_part1(col1); CREATE INDEX CONCURRENTLY idx_2 ON tab_part2(col1); After this step finishes the table level index which was created in the first step will be in valid state automatically. ^ permalink raw reply [nested|flat] 3+ messages in thread
* Re: Creating big indexes 2024-06-08 07:23 Creating big indexes Lok P <[email protected]> 2024-06-09 05:06 ` Re: Creating big indexes sud <[email protected]> @ 2024-06-09 05:09 ` Lok P <[email protected]> 0 siblings, 0 replies; 3+ messages in thread From: Lok P @ 2024-06-09 05:09 UTC (permalink / raw) To: sud <[email protected]>; +Cc: pgsql-general <[email protected]> On Sun, Jun 9, 2024 at 10:36 AM sud <[email protected]> wrote: > > You can first create the index on the table using the "On ONLY"keyword, > something as below. > > CREATE INDEX idx ON ONLY tab(col1); > > Then create indexes on each partition in "concurrently" from multiple > sessions in chunks. > > CREATE INDEX CONCURRENTLY idx_1 ON tab_part1(col1); > CREATE INDEX CONCURRENTLY idx_2 ON tab_part2(col1); > > After this step finishes the table level index which was created in the > first step will be in valid state automatically. > > Thank you so much. Should we also tweak the parameters related to the parallelism and memory as I mentioned in the first post? ^ permalink raw reply [nested|flat] 3+ messages in thread
end of thread, other threads:[~2024-06-09 05:09 UTC | newest] Thread overview: 3+ messages (download: mbox mbox.gz follow: Atom feed) -- links below jump to the message on this page -- 2024-06-08 07:23 Creating big indexes Lok P <[email protected]> 2024-06-09 05:06 ` sud <[email protected]> 2024-06-09 05:09 ` Lok P <[email protected]>
This inbox is served by agora; see mirroring instructions for how to clone and mirror all data and code used for this inbox