public inbox for [email protected]
help / color / mirror / Atom feedFrom: Hannu Krosing <[email protected]>
To: Ashutosh Bapat <[email protected]>
Cc: PostgreSQL Hackers <[email protected]>
Cc: Nathan Bossart <[email protected]>
Subject: Re: Patch: dumping tables data in multiple chunks in pg_dump
Date: Thu, 13 Nov 2025 19:02:43 +0100
Message-ID: <CAMT0RQTHoL8S7OonFWC_aDSC-2oX7BGBBLAQ+OOBhRPcxV2eiw@mail.gmail.com> (raw)
In-Reply-To: <CAExHW5t54GPKFbW3KLzintJ6jMMRYwb-t2Fjm4JTxEcZbGDomA@mail.gmail.com>
References: <CAMT0RQT_0qVxcTT6ycM20QUN-pEQ6iMLbz6gLWgLpeF0NmNOUA@mail.gmail.com>
<CAExHW5t54GPKFbW3KLzintJ6jMMRYwb-t2Fjm4JTxEcZbGDomA@mail.gmail.com>
I just ran a test by generating a 408GB table and then dumping it both ways
$ time pg_dump --format=directory -h 10.58.80.2 -U postgres -f
/tmp/plain.dump largedb
real 39m54.968s
user 37m21.557s
sys 2m32.422s
$ time ./pg_dump --format=directory -h 10.58.80.2 -U postgres
--huge-table-chunk-pages=131072 -j 8 -f /tmp/parallel8.dump largedb
real 5m52.965s
user 40m27.284s
sys 3m53.339s
So parallel dump with 8 workers using 1GB (128k pages) chunks runs
almost 7 times faster than the sequential dump.
this was a table that had no TOAST part. I will run some more tests
with TOASTed tables next and expect similar or better improvements.
On Wed, Nov 12, 2025 at 1:59 PM Ashutosh Bapat
<[email protected]> wrote:
>
> Hi Hannu,
>
> On Tue, Nov 11, 2025 at 9:00 PM Hannu Krosing <[email protected]> wrote:
> >
> > Attached is a patch that adds the ability to dump table data in multiple chunks.
> >
> > Looking for feedback at this point:
> > 1) what have I missed
> > 2) should I implement something to avoid single-page chunks
> >
> > The flag --huge-table-chunk-pages which tells the directory format
> > dump to dump tables where the main fork has more pages than this in
> > multiple chunks of given number of pages,
> >
> > The main use case is speeding up parallel dumps in case of one or a
> > small number of HUGE tables so parts of these can be dumped in
> > parallel.
>
> Have you measured speed up? Can you please share the numbers?
>
> --
> Best Wishes,
> Ashutosh Bapat
view thread (16+ messages) latest in thread
reply
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Reply to all the recipients using the --to and --cc options:
reply via email
To: [email protected]
Cc: [email protected], [email protected], [email protected], [email protected]
Subject: Re: Patch: dumping tables data in multiple chunks in pg_dump
In-Reply-To: <CAMT0RQTHoL8S7OonFWC_aDSC-2oX7BGBBLAQ+OOBhRPcxV2eiw@mail.gmail.com>
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox