public inbox for [email protected]
* Multiple COPY statements for one table vs one for ~half a billion records
@ 2024-04-04 18:03 Carl L <[email protected]>
From: Carl L @ 2024-04-04 18:03 UTC
To: pgsql-general
Hi there,
I have around half a billion records generated by a back end and split
across 80 threads (one per core). Each thread performs a COPY from memory
(COPY ... FROM STDIN BINARY) into Postgres, i.e. 80 COPY statements are
running concurrently against one table. I can see each of the Postgres
processes sitting at around 15% CPU usage.
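For readers unfamiliar with the "from stdin binary" payload, here is a minimal sketch of how one thread's in-memory buffer could be built. It follows the binary COPY file layout from the PostgreSQL documentation (signature, flags, header extension, tuples, trailer). The two-column layout (an int4 id and a text value), the UTF-8 server encoding and the function name are assumptions for illustration; the original post does not describe the table.

```python
import io
import struct

# Fixed 11-byte signature that opens every PostgreSQL binary COPY stream.
PGCOPY_SIGNATURE = b"PGCOPY\n\xff\r\n\x00"


def encode_binary_copy(rows):
    """Encode (id, payload) rows as a binary COPY payload.

    Hypothetical layout: column 1 is int4, column 2 is text (UTF-8 assumed).
    """
    buf = io.BytesIO()
    buf.write(PGCOPY_SIGNATURE)
    buf.write(struct.pack("!ii", 0, 0))  # flags field, header extension length
    for record_id, payload in rows:
        buf.write(struct.pack("!h", 2))  # number of fields in this tuple
        buf.write(struct.pack("!ii", 4, record_id))  # int4: length 4, then value
        if payload is None:
            buf.write(struct.pack("!i", -1))  # NULL is length -1 with no data
        else:
            data = payload.encode("utf-8")
            buf.write(struct.pack("!i", len(data)))  # byte length, then bytes
            buf.write(data)
    buf.write(struct.pack("!h", -1))  # file trailer: field count of -1
    return buf.getvalue()


if __name__ == "__main__":
    blob = encode_binary_copy([(1, "a"), (2, None)])
    print(len(blob), "bytes ready to send to COPY ... FROM STDIN (FORMAT binary)")
```

Each of the 80 threads would build and send its own buffer like this over its own connection; the sketch covers only the encoding, not the transport.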
These are all also in the same transaction - I am the only one connected,
so it's not an issue to hold a big transaction.
I can see that many of the Postgres processes have the wait event "LWLock:
BufferContent", which I assume means that they are waiting for each other
before they can write to the table. Would it therefore be more efficient
to combine all of these into one COPY statement?
Thanks!
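
One way to quantify the wait events described above is to sample the pg_stat_activity system view and count how many COPY backends are in each wait state. The sketch below assumes the loader sessions are the only ones running COPY. The query filter, the function name and the sample rows are illustrative and not taken from the thread; running the query needs a database driver, which is left out here.

```python
from collections import Counter

# pg_stat_activity is a standard system view. The ILIKE filter is an
# assumption: it treats every active session whose query starts with COPY
# as one of the loader sessions.
WAIT_QUERY = """
SELECT pid, wait_event_type, wait_event
FROM pg_stat_activity
WHERE state = 'active' AND query ILIKE 'copy %'
"""


def summarize_waits(rows):
    """Count backends per wait state from (pid, wait_event_type, wait_event) rows."""
    counts = Counter()
    for _pid, wait_type, wait_event in rows:
        # A NULL wait_event means the backend is not currently waiting.
        label = "running" if wait_event is None else f"{wait_type}: {wait_event}"
        counts[label] += 1
    return counts.most_common()


if __name__ == "__main__":
    # Hypothetical sample of what one poll might return.
    sample = [
        (101, "LWLock", "BufferContent"),
        (102, "LWLock", "BufferContent"),
        (103, None, None),
    ]
    for label, n in summarize_waits(sample):
        print(f"{n:3d}  {label}")
```

Polling this a few times during the load gives a rough picture of how many of the 80 backends are blocked at any moment, which is more informative than a single glance at the process list.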
* Re: Multiple COPY statements for one table vs one for ~half a billion records
@ 2024-04-04 18:15 Ron Johnson <[email protected]>
From: Ron Johnson @ 2024-04-04 18:15 UTC
To: pgsql-general
On Thu, Apr 4, 2024 at 2:04 PM Carl L <[email protected]> wrote:
> Hi there,
>
> I have around half a billion records that are being generated from a back
> end that are split into 80 threads (one per core) and I'm performing a copy
> from memory ( from stdin binary) into Postgres from each of these threads -
> i.e. there are 80 COPY statements being generated for one table that are
> running concurrently. I can see each of the Postgres processes sitting at
> around 15% CPU usage.
>
Is the target table partitioned in the same way that the input data is
split?
That would make things faster...
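
As an illustration of the partitioning idea, the sketch below generates DDL for a table that is list-partitioned by a worker number, plus one COPY statement per worker that targets that worker's partition directly. The reply does not name a partitioning scheme, so the worker_id column, the partition naming and the function itself are hypothetical. Identifiers are interpolated without quoting and are assumed to be trusted. Whether this removes the contention seen in this particular load is not established in the thread.

```python
def partition_statements(table, columns, workers):
    """Return (ddl, copy_by_worker) for a table list-partitioned by worker_id.

    columns is a list of (name, sql_type) pairs; worker_id is added as the
    hypothetical partition key.
    """
    cols = ",\n    ".join(f"{name} {sql_type}" for name, sql_type in columns)
    ddl = [
        f"CREATE TABLE {table} (\n"
        f"    worker_id int NOT NULL,\n"
        f"    {cols}\n"
        f") PARTITION BY LIST (worker_id);"
    ]
    copy_by_worker = {}
    for w in range(workers):
        part = f"{table}_p{w}"
        # One partition per worker, holding only that worker's rows.
        ddl.append(f"CREATE TABLE {part} PARTITION OF {table} FOR VALUES IN ({w});")
        # Each worker copies straight into its own partition.
        copy_by_worker[w] = f"COPY {part} FROM STDIN (FORMAT binary)"
    return ddl, copy_by_worker


if __name__ == "__main__":
    ddl, copies = partition_statements("records", [("payload", "text")], 3)
    print("\n".join(ddl))
    print(copies[0])
```

With this layout each of the concurrent COPY statements writes to a separate relation instead of all 80 appending to the same one.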
> These are all also in the same transaction - I am the only one connected,
> so it's not an issue to hold a big transaction.
>
Unless it fills up your WAL partition.
>