From: Hannu Krosing
Date: Thu, 13 Nov 2025 21:26:46 +0100
Subject: Re: Patch: dumping tables data in multiple chunks in pg_dump
To: Ashutosh Bapat
Cc: PostgreSQL Hackers, Nathan Bossart

The reason for the small chunk sizes is that they are determined by the
main heap table, which was just over 1GB:

largetoastdb=> SELECT
    format('%I.%I', t.schemaname, t.relname) as table_name,
    pg_table_size(t.relid) AS table_size,
    sum(pg_relation_size(i.indexrelid)) AS total_index_size,
    pg_relation_size(t.relid) AS main_table_size,
    pg_relation_size(c.reltoastrelid) AS toast_table_size,
    pg_relation_size(oi.indexrelid) AS toast_index_size,
    t.n_live_tup AS row_count,
    count(*) AS index_count,
    array_to_json(array_agg(json_build_object(i.indexrelid::regclass,
        pg_relation_size(i.indexrelid))), true) AS index_info
FROM pg_stat_user_tables t
JOIN pg_stat_user_indexes i ON i.relid = t.relid
JOIN pg_class c ON c.oid = t.relid
LEFT JOIN pg_stat_sys_indexes AS oi ON oi.relid = c.reltoastrelid
GROUP BY 1, 2, 4, 5, 6, 7
ORDER BY 2 DESC, 7 DESC
LIMIT 25;
┌─[ RECORD 1 ]─────┬─────────────────────────────────────┐
│ table_name       │ public.just_toasted                 │
│ table_size       │ 56718835712                         │
│ total_index_size │ 230064128                           │
│ main_table_size  │ 1191559168                          │
│ toast_table_size │ 54613336064                         │
│ toast_index_size │ 898465792                           │
│ row_count        │ 5625234                             │
│ index_count      │ 1                                   │
│ index_info       │ [{"just_toasted_pkey" : 230064128}] │
└──────────────────┴─────────────────────────────────────┘

On Thu, Nov 13, 2025 at 9:24 PM Hannu Krosing wrote:
>
> Ran another test with a 53GB database where most of the data is in TOAST
>
> CREATE TABLE just_toasted(
>   id serial primary key,
>   toasted1 char(2200) STORAGE EXTERNAL,
>   toasted2 char(2200) STORAGE EXTERNAL,
>   toasted3 char(2200) STORAGE EXTERNAL,
>   toasted4 char(2200) STORAGE EXTERNAL
> );
>
> and the toast fields were added in somewhat randomised order.
>
> Here the results are as follows
>
> Parallelism | chunk size (pages) | time (sec)
>           1 |                  - |        240
>           2 |               1000 |        129
>           4 |               1000 |         64
>           8 |               1000 |         36
>          16 |               1000 |         30
>
>           4 |               9095 |         78
>           8 |               9095 |         42
>          16 |               9095 |         42
>
> The reason larger chunk sizes performed worse was that they often had
> one or two stragglers left behind which
>
> Detailed run results below:
>
> hannuk@pgn2:~/work/postgres/src/bin/pg_dump$ time ./pg_dump
> --format=directory -h 10.58.80.2 -U postgres -f
> /tmp/ltoastdb-1-plain.dump largetoastdb
> real 3m59.465s
> user 3m43.304s
> sys 0m15.844s
>
> hannuk@pgn2:~/work/postgres/src/bin/pg_dump$ time ./pg_dump
> --format=directory -h 10.58.80.2 -U postgres
> --huge-table-chunk-pages=9095 -j 4 -f /tmp/ltoastdb-4.dump
> largetoastdb
> real 1m18.320s
> user 3m49.236s
> sys 0m19.422s
>
> hannuk@pgn2:~/work/postgres/src/bin/pg_dump$ time ./pg_dump
> --format=directory -h 10.58.80.2 -U postgres
> --huge-table-chunk-pages=9095 -j 8 -f /tmp/ltoastdb-8.dump
> largetoastdb
> real 0m42.028s
> user 3m55.299s
> sys 0m24.657s
>
> hannuk@pgn2:~/work/postgres/src/bin/pg_dump$ time ./pg_dump
> --format=directory -h 10.58.80.2 -U postgres
> --huge-table-chunk-pages=9095 -j 16 -f /tmp/ltoastdb-16.dump
> largetoastdb
> real 0m42.575s
> user 4m11.011s
> sys 0m26.110s
>
> hannuk@pgn2:~/work/postgres/src/bin/pg_dump$
> time ./pg_dump --format=directory -h 10.58.80.2 -U postgres
> --huge-table-chunk-pages=1000 -j 16 -f /tmp/ltoastdb-16-1kpages.dump
> largetoastdb
> real 0m29.641s
> user 6m16.321s
> sys 0m49.345s
>
> hannuk@pgn2:~/work/postgres/src/bin/pg_dump$ time ./pg_dump
> --format=directory -h 10.58.80.2 -U postgres
> --huge-table-chunk-pages=1000 -j 8 -f /tmp/ltoastdb-8-1kpages.dump
> largetoastdb
> real 0m35.685s
> user 3m58.528s
> sys 0m26.729s
>
> hannuk@pgn2:~/work/postgres/src/bin/pg_dump$ time ./pg_dump
> --format=directory -h 10.58.80.2 -U postgres
> --huge-table-chunk-pages=1000 -j 4 -f /tmp/ltoastdb-4-1kpages.dump
> largetoastdb
> real 1m3.737s
> user 3m50.251s
> sys 0m18.507s
>
> hannuk@pgn2:~/work/postgres/src/bin/pg_dump$ time ./pg_dump
> --format=directory -h 10.58.80.2 -U postgres
> --huge-table-chunk-pages=1000 -j 2 -f /tmp/ltoastdb-2-1kpages.dump
> largetoastdb
> real 2m8.708s
> user 3m57.018s
> sys 0m18.499s
>
> On Thu, Nov 13, 2025 at 7:39 PM Hannu Krosing wrote:
> >
> > Going up to 16 workers did not improve performance, but this is
> > expected, as the disk behind the database can only do 4TB/hour of
> > reads, which is now the bottleneck.
> > (408/352*3600 = 4172 GB/h)
> >
> > $ time ./pg_dump --format=directory -h 10.58.80.2 -U postgres
> > --huge-table-chunk-pages=131072 -j 16 -f /tmp/parallel16.dump largedb
> > real 5m44.900s
> > user 53m50.491s
> > sys 5m47.602s
> >
> > And 4 workers showed near-linear speedup from a single worker
> >
> > hannuk@pgn2:~/work/postgres/src/bin/pg_dump$ time ./pg_dump
> > --format=directory -h 10.58.80.2 -U postgres
> > --huge-table-chunk-pages=131072 -j 4 -f /tmp/parallel4.dump largedb
> > real 10m32.074s
> > user 38m54.436s
> > sys 2m58.216s
> >
> > The database runs on a 64vCPU VM with 128GB RAM, so most of the table
> > will be read in from the disk
> >
> > On Thu, Nov 13, 2025 at 7:02 PM Hannu Krosing wrote:
> > >
> > > I just ran a test by generating a 408GB table and then dumping it both ways
> > >
> > > $ time pg_dump --format=directory -h 10.58.80.2 -U postgres -f
> > > /tmp/plain.dump largedb
> > >
> > > real 39m54.968s
> > > user 37m21.557s
> > > sys 2m32.422s
> > >
> > > $ time ./pg_dump --format=directory -h 10.58.80.2 -U postgres
> > > --huge-table-chunk-pages=131072 -j 8 -f /tmp/parallel8.dump largedb
> > >
> > > real 5m52.965s
> > > user 40m27.284s
> > > sys 3m53.339s
> > >
> > > So parallel dump with 8 workers using 1GB (128k pages) chunks runs
> > > almost 7 times faster than the sequential dump.
> > >
> > > This was a table that had no TOAST part. I will run some more tests
> > > with TOASTed tables next and expect similar or better improvements.
> > >
> > > On Wed, Nov 12, 2025 at 1:59 PM Ashutosh Bapat
> > > wrote:
> > > >
> > > > Hi Hannu,
> > > >
> > > > On Tue, Nov 11, 2025 at 9:00 PM Hannu Krosing wrote:
> > > > >
> > > > > Attached is a patch that adds the ability to dump table data in multiple chunks.
> > > > >
> > > > > Looking for feedback at this point:
> > > > > 1) what have I missed
> > > > > 2) should I implement something to avoid single-page chunks
> > > > >
> > > > > The flag --huge-table-chunk-pages tells the directory format
> > > > > dump to dump tables whose main fork has more pages than this in
> > > > > multiple chunks of the given number of pages.
> > > > >
> > > > > The main use case is speeding up parallel dumps in case of one or a
> > > > > small number of HUGE tables so parts of these can be dumped in
> > > > > parallel.
> > > >
> > > > Have you measured the speedup? Can you please share the numbers?
> > > >
> > > > --
> > > > Best Wishes,
> > > > Ashutosh Bapat
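
A rough cross-check of the figures in this thread, as a minimal Python
sketch. It assumes the default 8kB PostgreSQL block size (the thread does
not state the block size). The helper names chunk_count, to_seconds,
speedup and read_rate_gb_per_hour are made up for illustration and are not
part of pg_dump or the patch.

```python
PAGE_SIZE = 8192  # assumption: default PostgreSQL block size, not stated in the thread


def chunk_count(main_fork_bytes: int, chunk_pages: int, page_size: int = PAGE_SIZE) -> int:
    """Number of chunks needed to cover a main fork of the given size."""
    pages = -(-main_fork_bytes // page_size)   # ceiling division: bytes -> pages
    return -(-pages // chunk_pages)            # ceiling division: pages -> chunks


def to_seconds(minutes: int, seconds: float) -> float:
    """Convert the 'XmY.YYYs' wall-clock format printed by `time` to seconds."""
    return minutes * 60 + seconds


def speedup(baseline_sec: float, parallel_sec: float) -> float:
    """How many times faster the parallel run was than the baseline."""
    return baseline_sec / parallel_sec


def read_rate_gb_per_hour(size_gb: float, elapsed_sec: float) -> float:
    """Average read rate, same arithmetic as 408/352*3600 in the thread."""
    return size_gb / elapsed_sec * 3600


if __name__ == "__main__":
    main_fork = 1191559168  # main_table_size of public.just_toasted, from the query output

    for chunk_pages in (9095, 1000):
        print(f"chunk pages {chunk_pages}: {chunk_count(main_fork, chunk_pages)} chunks")

    # 408GB table: plain dump (39m54.968s) vs. 8 workers (5m52.965s)
    ratio = speedup(to_seconds(39, 54.968), to_seconds(5, 52.965))
    print(f"8-worker speedup: {ratio:.2f}x")

    # Read rate using the same rounded 352 seconds as in the thread
    print(f"read rate: {read_rate_gb_per_hour(408, 352):.0f} GB/h")
```

Under the 8kB assumption the 1191559168-byte main fork is 145454 pages, so
--huge-table-chunk-pages=9095 gives 16 chunks and 1000 gives 146. With only
16 chunks there is little left to rebalance once one or two chunks run
long, which would be consistent with the straggler explanation given for
the 9095-page runs.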