From: Peter J. Holzer <[email protected]>
To: [email protected]
Subject: Re: Faster data load
Date: Sun, 8 Sep 2024 19:45:39 +0200
Message-ID: <[email protected]> (raw)
In-Reply-To: <CAKna9VaVsDzfOfOGu1+grStp9BBHFMKrH5DCEbbtGcQUWJ74KQ@mail.gmail.com>
References: <CAKna9VaVsDzfOfOGu1+grStp9BBHFMKrH5DCEbbtGcQUWJ74KQ@mail.gmail.com>
On 2024-09-06 01:44:00 +0530, Lok P wrote:
> We are having a requirement to create approx 50 billion rows in a partition
> table(~1 billion rows per partition, 200+gb size daily partitions) for a
> performance test. We are currently using ' insert into <target table_partition>
> select.. From <source_table_partition> or <some transformed query>;' method .
> We have dropped all indexes and constraints First and then doing the load.
> Still it's taking 2-3 hours to populate one partition.
That seems quite slow. Is the table very wide or does it have a large
number of indexes?
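
For scale, the figures quoted above (about 1 billion rows and 200 GB per
partition, loaded in 2-3 hours) can be turned into throughput numbers. A
minimal Python sketch, treating "200+gb" as exactly 200 * 10**9 bytes
(an assumption; the real size is somewhat larger):

```python
ROWS_PER_PARTITION = 1_000_000_000   # "~1 billion rows per partition"
BYTES_PER_PARTITION = 200 * 10**9    # "200+gb", taken as decimal GB (assumption)


def throughput(hours):
    """Return (rows per second, MB per second) for a load taking `hours`."""
    seconds = hours * 3600
    rows_per_s = ROWS_PER_PARTITION / seconds
    mb_per_s = BYTES_PER_PARTITION / seconds / 10**6
    return rows_per_s, mb_per_s


for hours in (2, 3):
    rows_s, mb_s = throughput(hours)
    print(f"{hours} h: {rows_s:,.0f} rows/s, {mb_s:.1f} MB/s")
```

Under those assumptions the load runs at roughly 93,000 to 139,000 rows/s,
or about 18 to 28 MB/s, with an average row size of around 200 bytes.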
> Is there a faster way to achieve this?
>
> Few teammate suggesting to use copy command and use file load instead, which
> will be faster.
I doubt that.
I benchmarked several strategies for populating tables 5 years ago and
(for my test data and on our hardware at the time - YMMV) a simple
INSERT ... SELECT was more than twice as fast as 8 parallel COPY
operations (and about 8 times as fast as a single COPY).
Details will have changed since then (I should rerun that benchmark on
a current system), but I'd be surprised if COPY became that much faster
relative to INSERT ... SELECT.
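
Rerunning such a comparison could look roughly like the Python sketch
below. It is not the benchmark referred to above: the table names `src`
and `dst`, the file path, and the `run_sql` callable are hypothetical
placeholders, and the helper simply reports the best of several runs.

```python
import time

# Hypothetical statements; `src`, `dst` and the file path are placeholders.
STRATEGIES = {
    "insert_select": "INSERT INTO dst SELECT * FROM src",
    "copy_from_file": "COPY dst FROM '/tmp/src.csv' WITH (FORMAT csv)",
}


def best_time(run_sql, statement, repeats=3):
    """Run `statement` `repeats` times; return the fastest run in seconds.

    The target table is emptied before each run, outside the timed region.
    """
    timings = []
    for _ in range(repeats):
        run_sql("TRUNCATE dst")
        start = time.perf_counter()
        run_sql(statement)
        timings.append(time.perf_counter() - start)
    return min(timings)


def compare(run_sql, repeats=3):
    """Return {strategy name: best elapsed seconds} for every strategy."""
    return {
        name: best_time(run_sql, stmt, repeats)
        for name, stmt in STRATEGIES.items()
    }
```

With a real connection, `run_sql` could be something like a cursor's
`execute` method with autocommit enabled; that wiring is left out here
because it depends on the driver in use.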
hp
--
_ | Peter J. Holzer | Story must make more sense than reality.
|_|_) | |
| | | [email protected] | -- Charles Stross, "Creative writing
__/ | http://www.hjp.at/ | challenge!"