From: Hannu Krosing <[email protected]>
To: Michael Banck <[email protected]>
Cc: David Rowley <[email protected]>
Cc: Ashutosh Bapat <[email protected]>
Cc: PostgreSQL Hackers <[email protected]>
Cc: Nathan Bossart <[email protected]>
Subject: Re: Patch: dumping tables data in multiple chunks in pg_dump
Date: Sat, 28 Mar 2026 16:33:59 +0100
Message-ID: <CAMT0RQTe4Zr=rdcKMJj-=c7CH0PJh=ZPk=xOU98+M7p9-D+Yew@mail.gmail.com>
In-Reply-To: <CAMT0RQRtLwi_CrOcD7KxYL0Gm1nGXb-HWmerVg=ajEs6JP7m+w@mail.gmail.com>

The above

"Or it can be almost 200 GB if the page has just pointers to 1GB TOAST items."

should read

"Or it can be almost 200 GB *for a single page* if the page has just
pointers to 1GB TOAST items."
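
As a rough back-of-envelope check of that per-page figure, here is a
minimal sketch. It assumes the default 8 kB block size, a 24-byte page
header, 4-byte line pointers, a 23-byte heap tuple header padded to
8-byte alignment, and a single-column table where every row holds only
an 18-byte on-disk TOAST pointer to a value at the 1 GB field limit.
The function and constant names are for illustration only and are not
taken from the patch.

```python
# Back-of-envelope estimate; all constants are assumptions for a default build.
BLCKSZ = 8192         # default page size in bytes
PAGE_HEADER = 24      # page header size
LINE_POINTER = 4      # one line pointer per tuple
TUPLE_HEADER = 23     # heap tuple header before alignment padding
TOAST_POINTER = 18    # on-disk pointer to an out-of-line value
MAXALIGN = 8          # typical alignment on 64-bit platforms
MAX_FIELD_GB = 1      # upper limit for a single field value


def align_up(n: int, a: int = MAXALIGN) -> int:
    """Round n up to the next multiple of a."""
    return (n + a - 1) // a * a


def toast_only_tuples_per_page() -> int:
    """Tuples that fit on one page if each row is a single TOAST pointer."""
    # Data starts at the aligned header offset; the whole tuple is aligned too.
    tuple_size = align_up(align_up(TUPLE_HEADER) + TOAST_POINTER)
    return (BLCKSZ - PAGE_HEADER) // (tuple_size + LINE_POINTER)


def worst_case_gb_per_page() -> int:
    """Upper bound on detoasted data reachable from one main-table page."""
    return toast_only_tuples_per_page() * MAX_FIELD_GB


if __name__ == "__main__":
    print(toast_only_tuples_per_page(), "tuples,",
          worst_case_gb_per_page(), "GB per page")
```

Under these assumptions a single page can reference well over 100 GB
of TOASTed data, the same order of magnitude as the figure quoted
above; the exact number depends on the tuple layout.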


On Sat, Mar 28, 2026 at 4:32 PM Hannu Krosing <[email protected]> wrote:
>
> The issue is that currently the value is given in "main table pages"
> and it would be somewhat deceptive, or at least confusing, to try to
> express this in any other unit.
>
> As I explained in the commit message:
>
> ---------8<-------------------8<-------------------8<----------------
> This --max-table-segment-pages number specifically applies to main table
> pages which does not guarantee anything about output size.
> The output could be empty if there are no live tuples in the page range.
> Or it can be almost 200 GB if the page has just pointers to 1GB TOAST items.
> ---------8<-------------------8<-------------------8<----------------
>
> And I can think of no cheap and reliable way to change that equation.
>
> I'd be very happy if you have any good ideas for improving the flag
> name, or can even propose a way to better estimate the resulting dump
> file size so that we could give the chunk size in better units.
>
> ---
> Hannu
>
> On Sat, Mar 28, 2026 at 12:26 PM Michael Banck <[email protected]> wrote:
> >
> > Hi,
> >
> > On Tue, Jan 13, 2026 at 03:27:25PM +1300, David Rowley wrote:
> > > Perhaps --max-table-segment-pages is a better name than
> > > --huge-table-chunk-pages, as the minimum number of pages required
> > > to make a table "huge" is quite subjective.
> >
> > I'm not sure that's better - without looking at the documentation,
> > people might confuse segment here with the 1GB split of tables into
> > segments. As pg_dump is a very common and basic user tool, I don't think
> > implementation details like pages/page sizes and blocks should be part
> > of its UX.
> >
> > Can't we just make it a storage size, like '10GB' and then rename it to
> > --table-parallel-threshold or something? I agree it's bikeshedding, but
> > I personally don't like either --max-table-segment-pages or
> > --huge-table-chunk-pages.
> >
> >
> > Michael
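
To illustrate why a size-based option would still be a page count in
disguise, here is a minimal sketch of mapping a value such as '10GB'
onto main-table pages. The helper name size_to_main_table_pages, the
accepted unit suffixes, and the 8 kB block size are assumptions for
illustration; none of this is part of the patch or of pg_dump.

```python
import re

BLCKSZ = 8192  # assumed default block size in bytes

# Hypothetical unit table using binary multiples.
_UNITS = {"kB": 1024, "MB": 1024**2, "GB": 1024**3, "TB": 1024**4}


def size_to_main_table_pages(text: str, block_size: int = BLCKSZ) -> int:
    """Convert a size string like '10GB' to a count of main-table pages."""
    m = re.fullmatch(r"\s*(\d+)\s*(kB|MB|GB|TB)\s*", text)
    if m is None:
        raise ValueError(f"unrecognized size: {text!r}")
    size_bytes = int(m.group(1)) * _UNITS[m.group(2)]
    # Never return zero pages, even for sizes below one block.
    return max(1, size_bytes // block_size)


if __name__ == "__main__":
    print(size_to_main_table_pages("10GB"))
```

The result bounds only how much of the main heap each chunk scans. It
says nothing about TOASTed data or dead tuples, so the output size per
chunk stays unbounded in the ways described in the commit message.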




