public inbox for [email protected]  
help / color / mirror / Atom feed
From: Muhammad Usman Khan <[email protected]>
To: Lok P <[email protected]>
Cc: pgsql-general <[email protected]>
Subject: Re: Faster data load
Date: Fri, 6 Sep 2024 08:49:51 +0500
Message-ID: <CAPnRvGswoifDeosThKM5zkFZzph5u86ozYkrETpGD=oig14pOA@mail.gmail.com> (raw)
In-Reply-To: <CAKna9VaVsDzfOfOGu1+grStp9BBHFMKrH5DCEbbtGcQUWJ74KQ@mail.gmail.com>
References: <CAKna9VaVsDzfOfOGu1+grStp9BBHFMKrH5DCEbbtGcQUWJ74KQ@mail.gmail.com>

Hi,

You can use pg_partman. If your table is partitioned, you can manage
partitions in parallel by distributing the load across partitions
concurrently. Or you can use citus. It can be an excellent solution,
especially for handling large data volumes and parallelizing data operations


On Fri, 6 Sept 2024 at 01:14, Lok P <[email protected]> wrote:

> Hi,
>
> We are having a requirement to create approx 50 billion rows in a
> partition table(~1 billion rows per partition, 200+gb size daily
> partitions) for a performance test. We are currently using ' insert into
> <target table_partition> select.. From <source_table_partition> or <some
> transformed query>;' method . We have dropped all indexes and constraints
> First and then doing the load. Still it's taking 2-3 hours to populate one
> partition. Is there a faster way to achieve this?
>
> Few teammate suggesting to use copy command and use file load instead,
> which will be faster. So I wanted to understand, how different things it
> does behind the scenes as compared to insert as select command? As because
> it only deals with sql engine only.
>
> Additionally, when we were trying to create indexes post data load on one
> partition, it took 30+ minutes. Any possible way to make it faster?
>
> Is there any way to drive the above things in parallel by utilizing full
> database resources?
>
> It's postgres 15.4
>
> Regards
> Lok
>


view thread (9+ messages)  latest in thread

reply

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Reply to all the recipients using the --to and --cc options:
  reply via email

  To: [email protected]
  Cc: [email protected], [email protected], [email protected]
  Subject: Re: Faster data load
  In-Reply-To: <CAPnRvGswoifDeosThKM5zkFZzph5u86ozYkrETpGD=oig14pOA@mail.gmail.com>

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox