From: Hannu Krosing
Date: Mon, 24 Nov 2025 22:02:15 +0100
Subject: Re: Patch: dumping tables data in multiple chunks in pg_dump
To: Dilip Kumar
Cc: PostgreSQL Hackers, Nathan Bossart

The expectation was that, since chunking is useful mainly for really
huge tables, ANALYZE would have been run
"recently enough". Maybe we should use pg_relation_size() in case we have already determined that the table is large enough to warrant chunking? Maybe at least 1/2 of the requested chunk size? My reasoning was to not put too much extra load on pg_dump in case chunking is not required. But of course we can use the presence of a chunking request to decide to run pg_relation_size(), assuming the overhead won't be too large in this case. On Mon, Nov 17, 2025 at 5:15=E2=80=AFAM Dilip Kumar = wrote: > > On Tue, Nov 11, 2025 at 9:00=E2=80=AFPM Hannu Krosing = wrote: > > > > Attached is a patch that adds the ability to dump table data in multipl= e chunks. > > > > Looking for feedback at this point: > > 1) what have I missed > > 2) should I implement something to avoid single-page chunks > > > > The flag --huge-table-chunk-pages which tells the directory format > > dump to dump tables where the main fork has more pages than this in > > multiple chunks of given number of pages, > > > > The main use case is speeding up parallel dumps in case of one or a > > small number of HUGE tables so parts of these can be dumped in > > parallel. > > > > +1 for the idea, I haven't done the detailed review but I was just > going through the patch, I noticed that we use pg_class->relpages to > identify whether to chunk the table or not, which should be fine but > don't you think if we use direct size calculation function like > pg_relation_size() we might get better idea and not dependent upon > whether the stats are updated or not? This will make chunking > behavior more deterministic. > > -- > Regards, > Dilip Kumar > Google