From: Ron Johnson <[email protected]>
To: pgsql-general <[email protected]>
Subject: Re: Multiple COPY statements for one table vs one for ~half a billion records
Date: Thu, 4 Apr 2024 14:15:40 -0400
Message-ID: <CANzqJaCBmGpbqc6RS-BubJpeW3aUNiM3knQL0KWOmCFdoeOQvg@mail.gmail.com> (raw)
In-Reply-To: <CAPtGvF9i5XunrgFUWYrCLnmnD0akdLKBQLdO1qsz9C5nz0m3ZQ@mail.gmail.com>
References: <CAPtGvF9i5XunrgFUWYrCLnmnD0akdLKBQLdO1qsz9C5nz0m3ZQ@mail.gmail.com>

On Thu, Apr 4, 2024 at 2:04 PM Carl L <[email protected]> wrote:

> Hi there,
>
> I have around half a billion records being generated by a back end that
> is split into 80 threads (one per core), and I'm performing a COPY from
> memory (FROM STDIN BINARY) into Postgres from each of these threads,
> i.e. there are 80 COPY statements for one table running concurrently.
> I can see each of the Postgres processes sitting at around 15% CPU
> usage.
>

Is the target table partitioned in the same way that the input data is
split?

That would make things faster...
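
As a rough sketch of what I mean, assuming each loader thread can stamp
its rows with its own shard number: list-partition the table on that
number and have every thread COPY straight into its own partition. The
table name (records), the columns (shard_id, payload), and the helper
functions below are all made up for illustration; the script only builds
the SQL strings and does not connect to anything.

```python
# Sketch only: one LIST partition per loader thread, so that each COPY
# targets its own partition. Table and column names are hypothetical.

NUM_WORKERS = 80  # one loader thread per core, as described in the question


def parent_ddl(table="records"):
    """DDL for a parent table list-partitioned on a thread-assigned shard key."""
    return (
        f"CREATE TABLE {table} (\n"
        "    shard_id integer NOT NULL,\n"
        "    payload  bytea\n"
        ") PARTITION BY LIST (shard_id);"
    )


def partition_ddl(worker, table="records"):
    """DDL for the partition that holds exactly one thread's rows."""
    return (
        f"CREATE TABLE {table}_p{worker} PARTITION OF {table} "
        f"FOR VALUES IN ({worker});"
    )


def copy_statement(worker, table="records"):
    """COPY command a given thread would issue against its own partition."""
    return f"COPY {table}_p{worker} FROM STDIN (FORMAT binary);"


def build_plan(num_workers=NUM_WORKERS, table="records"):
    """Return (list of DDL statements, dict mapping thread number to COPY)."""
    ddl = [parent_ddl(table)]
    ddl += [partition_ddl(w, table) for w in range(num_workers)]
    copies = {w: copy_statement(w, table) for w in range(num_workers)}
    return ddl, copies


if __name__ == "__main__":
    ddl, copies = build_plan()
    print(ddl[0])
    print(ddl[1])
    print(copies[0])
```

Every row thread N sends has to carry shard_id = N, otherwise the COPY
into that partition fails the partition constraint. I used LIST rather
than HASH partitioning here because the client would otherwise have to
reproduce the server's hash function to know which partition a row
belongs to.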


> These are all also in the same transaction - I am the only one connected,
> so it's not an issue to hold a big transaction.
>

Unless it fills up your WAL partition.
