public inbox for [email protected]  
help / color / mirror / Atom feed
From: Jeff Ross <[email protected]>
To: [email protected]
Subject: Re: Faster data load
Date: Thu, 5 Sep 2024 16:21:09 -0600
Message-ID: <[email protected]> (raw)
In-Reply-To: <CAKna9VaVsDzfOfOGu1+grStp9BBHFMKrH5DCEbbtGcQUWJ74KQ@mail.gmail.com>
References: <CAKna9VaVsDzfOfOGu1+grStp9BBHFMKrH5DCEbbtGcQUWJ74KQ@mail.gmail.com>

On 9/5/24 14:14, Lok P wrote:

> Hi,
>
> We are having a requirement to create approx 50 billion rows in a 
> partition table(~1 billion rows per partition, 200+gb size daily 
> partitions) for a performance test. We are currently using ' insert 
> into <target table_partition> select.. From <source_table_partition> 
> or <some transformed query>;' method . We have dropped all indexes and 
> constraints First and then doing the load. Still it's taking 2-3 hours 
> to populate one partition. Is there a faster way to achieve this?
>
> Few teammate suggesting to use copy command and use file load instead, 
> which will be faster. So I wanted to understand, how different things 
> it does behind the scenes as compared to insert as select command? As 
> because it only deals with sql engine only.
>
> Additionally, when we were trying to create indexes post data load on 
> one partition, it took 30+ minutes. Any possible way to make it faster?
>
> Is there any way to drive the above things in parallel by utilizing 
> full database resources?
>
> It's postgres 15.4
>
> Regards
> Lok

Try pg_bulkload to load the data--takes a little set up but it is very 
fast.  Do pay attention to the caveats.  For a performance test they 
probably won't be relevant.

https://github.com/ossc-db/pg_bulkload?tab=readme-ov-file

Jeff







view thread (9+ messages)  latest in thread

reply

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Reply to all the recipients using the --to and --cc options:
  reply via email

  To: [email protected]
  Cc: [email protected], [email protected]
  Subject: Re: Faster data load
  In-Reply-To: <[email protected]>

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox