From: Dilip Kumar <[email protected]>
To: Hannu Krosing <[email protected]>
Cc: David Rowley <[email protected]>
Cc: Zsolt Parragi <[email protected]>
Cc: Ashutosh Bapat <[email protected]>
Cc: PostgreSQL Hackers <[email protected]>
Cc: Nathan Bossart <[email protected]>
Subject: Re: Patch: dumping tables data in multiple chunks in pg_dump
Date: Thu, 12 Feb 2026 11:43:22 +0530
Message-ID: <CAFiTN-s50GtFf650TT9Fko+q5rc+xwm+x106ugB3BE7_xgGjPQ@mail.gmail.com>
In-Reply-To: <CAMT0RQQKWWrYYYQ8QNurs7hXBC5DwBAV6b0JmHKvJk7wZnup0g@mail.gmail.com>
References: <CAMT0RQT_0qVxcTT6ycM20QUN-pEQ6iMLbz6gLWgLpeF0NmNOUA@mail.gmail.com>
<CAExHW5t54GPKFbW3KLzintJ6jMMRYwb-t2Fjm4JTxEcZbGDomA@mail.gmail.com>
<CAMT0RQTHoL8S7OonFWC_aDSC-2oX7BGBBLAQ+OOBhRPcxV2eiw@mail.gmail.com>
<CAMT0RQQAH1a8kY-mx7B07Uzn3T_zeaU9detqFFtW36_k67Su+A@mail.gmail.com>
<CAMT0RQQr7KtPAY903+F42csiHc1EPHo70Xji-znkxEhwdoKa6w@mail.gmail.com>
<CAMT0RQSNHFffbCmDNxQogVBD8H5gTDJNwhUR2btCVE+Lq1sGGw@mail.gmail.com>
<CAMT0RQTEFGctCfgVx3u2XgVRCAj_QURV2tfdzL0HOQi=u0sV2A@mail.gmail.com>
<CAApHDvr8ay+31Wd0TptDGp8cAg2-NOnWddx8csnUE3R03EbvZw@mail.gmail.com>
<CAMT0RQShjXPPdXQS-5uzDC3bXt+QEZR5tO02o1NHdXWNu2quvw@mail.gmail.com>
<CAN4CZFOTOX1nFkPrmM7fQ9qnrccvjwywXPf4Eo7C+DF-h-x96g@mail.gmail.com>
<CAMT0RQTbQxjdN5nv6M_HhFWiLqdT84=NYBM1ZYQKaAcf8Ufyaw@mail.gmail.com>
<CAN4CZFOjG2kUciKeVBpxrHJZNkZuzY_Q5ij_qpDZcAS3Ak2GxA@mail.gmail.com>
<CAMT0RQStjytRrGTe0X03ErC7anwxNRHAULYBsSmdWZV3fr4-Dg@mail.gmail.com>
<CAMT0RQROPMSPwfxAUCm1gZs9cUr7FmvwX+eO6Wzq_wWdd6eEAQ@mail.gmail.com>
<CAMT0RQQ8DX+K7OTw3Lg+Yp2ew8TsZduiqtPszfiBixcpxKbz-A@mail.gmail.com>
<CAApHDvo29-vQz=xV6+x5hU--NZ9qGPXsCNBuOAf88pAHjTpvvQ@mail.gmail.com>
<CAMT0RQQT1dQTPj4etRTc0877mirCPVKzjbF_U5KRAyPwhHMr0Q@mail.gmail.com>
<CAMT0RQQKWWrYYYQ8QNurs7hXBC5DwBAV6b0JmHKvJk7wZnup0g@mail.gmail.com>
On Wed, Jan 28, 2026 at 11:00 PM Hannu Krosing <[email protected]> wrote:
>
> v13 has added a proper test comparing original and restored table data
>
I was reviewing v13; here are some initial comments.
1. IMHO the commit message describes how the work progressed rather than
giving a high-level idea of what the patch actually does and how.
Suggestion:
SUBJECT: Add --max-table-segment-pages option to pg_dump for parallel
table dumping.
This patch introduces the ability to split large heap tables into segments
based on a specified number of pages. These segments can then be dumped in
parallel using the existing jobs infrastructure, significantly reducing
the time required to dump very large tables.
The implementation uses ctid-based range queries (e.g., WHERE ctid >=
'(start,1)' AND ctid <= '(end,32000)') to extract specific chunks of the
relation.
<more architecture details and limitation if any>
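To make the "split into segments based on a specified number of pages" idea concrete, here is a minimal standalone sketch of how page ranges could be derived. It is not taken from the patch: SketchSegment, sketch_split_pages and SKETCH_INVALID_BLOCK are hypothetical names, and the rule that the last segment is left open-ended is an assumption based on the endPage != InvalidBlockNumber check in the v13 hunk quoted under comment 4.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for PostgreSQL's InvalidBlockNumber; here it means "no upper bound". */
#define SKETCH_INVALID_BLOCK UINT32_MAX

/* Hypothetical page range for one segment (both ends inclusive). */
typedef struct
{
    uint32_t start_page;
    uint32_t end_page;
} SketchSegment;

/*
 * Fill out[] with page ranges covering a table of rel_pages pages, using
 * seg_pages pages per segment. Returns the number of segments written.
 *
 * Assumptions of this sketch:
 *   - a table at or below the threshold is not split (returns 0);
 *   - the last segment has no upper bound;
 *   - if max_out is too small, the ranges written do not cover the table.
 */
static size_t
sketch_split_pages(uint32_t rel_pages, uint32_t seg_pages,
                   SketchSegment *out, size_t max_out)
{
    size_t   n = 0;
    uint32_t start = 0;

    assert(seg_pages > 0);

    if (rel_pages <= seg_pages)
        return 0;

    while (n < max_out)
    {
        out[n].start_page = start;

        /* The remaining pages fit in one segment: this is the last one. */
        if (rel_pages - start <= seg_pages)
        {
            out[n].end_page = SKETCH_INVALID_BLOCK;
            return n + 1;
        }

        out[n].end_page = start + seg_pages - 1;
        start += seg_pages;
        n++;
    }
    return n;
}
```

With rel_pages = 250 and seg_pages = 100 this gives three ranges: 0..99, 100..199, and 200 onwards with no upper bound.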
2.
+ pg_log_warning("CHUNKING: set dopt.max_table_segment_pages to [%u]", dopt.max_table_segment_pages);
+ break;
IMHO we don't need to emit a warning here while processing the input parameters.
3.
+ printf(_(" --max-table-segment-pages=NUMPAGES\n"
+ " Number of main table pages above which data is \n"
+ " copied out in chunks, also determines the chunk size\n"));
Check the help text formatting: all the other option descriptions start
with a lower-case letter, so this one should start with "number" rather
than "Number".
4.
+ if (is_segment(tdinfo))
+ {
+ appendPQExpBufferStr(q, tdinfo->filtercond?" AND ":" WHERE ");
+ if(tdinfo->startPage == 0)
+ appendPQExpBuffer(q, "ctid <= '(%u,32000)'", tdinfo->endPage);
+ else if(tdinfo->endPage != InvalidBlockNumber)
+ appendPQExpBuffer(q, "ctid BETWEEN '(%u,1)' AND '(%u,32000)'",
+ tdinfo->startPage, tdinfo->endPage);
+ else
+ appendPQExpBuffer(q, "ctid >= '(%u,1)'", tdinfo->startPage);
+ pg_log_warning("CHUNKING: pages [%u:%u]",tdinfo->startPage, tdinfo->endPage);
+ }
IMHO we should explain this chunking logic in a comment above this code block.
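As a rough illustration of the kind of comment I mean, here is a standalone sketch of the same three-way branch with the cases spelled out. It uses snprintf instead of PQExpBuffer so that it compiles on its own; sketch_append_segment_filter and SKETCH_INVALID_BLOCK are hypothetical names, not part of the patch, and the comment wording is only a suggestion.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Stand-in for PostgreSQL's InvalidBlockNumber; here it means "no upper bound". */
#define SKETCH_INVALID_BLOCK UINT32_MAX

/*
 * Append the ctid range filter for one segment to an existing query string.
 *
 *   start_page == 0       first segment: an upper bound is enough
 *   end_page is valid     middle segment: bounded on both sides
 *   end_page is invalid   last segment: lower bound only, so it runs to
 *                         the end of the table
 *
 * Tuple offsets start at 1; 32000 is the constant used in the quoted hunk
 * and is larger than any offset a page can hold.
 */
static void
sketch_append_segment_filter(char *query, size_t querylen, int has_filter,
                             uint32_t start_page, uint32_t end_page)
{
    size_t used = strlen(query);
    int    n;

    assert(used < querylen);

    /* Join with AND if the query already carries a WHERE clause. */
    n = snprintf(query + used, querylen - used, "%s",
                 has_filter ? " AND " : " WHERE ");
    assert(n > 0 && (size_t) n < querylen - used);
    used += (size_t) n;

    if (start_page == 0)
        n = snprintf(query + used, querylen - used,
                     "ctid <= '(%" PRIu32 ",32000)'", end_page);
    else if (end_page != SKETCH_INVALID_BLOCK)
        n = snprintf(query + used, querylen - used,
                     "ctid BETWEEN '(%" PRIu32 ",1)' AND '(%" PRIu32 ",32000)'",
                     start_page, end_page);
    else
        n = snprintf(query + used, querylen - used,
                     "ctid >= '(%" PRIu32 ",1)'", start_page);
    assert(n > 0 && (size_t) n < querylen - used);
}
```

For a query without a filter and pages 100..199, this appends " WHERE ctid BETWEEN '(100,1)' AND '(199,32000)'".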
--
Regards,
Dilip Kumar
Google