From: lakshmi
Date: Mon, 13 Apr 2026 11:44:18 +0530
Subject: Re: parallel data loading for pgbench -i
To: Mircea Cadariu
Cc: Heikki Linnakangas, "Hayato Kuroda (Fujitsu)", PostgreSQL Hackers, "tomas@vondra.me"

Hi Mircea, Heikki,

I tested the v3 patch on 19devel with larger scale factors.

The behavior looks much better now compared to the earlier versions. For scale 100 and 500, I see clear improvements in overall runtime, and for scale 2000, the total time is around 97s on my system.

The loading phase now runs concurrently across workers, and I don’t see the earlier serialization behavior anymore.

The VACUUM phase also remains relatively small (~6s for scale 2000), which suggests that the previous overhead has been addressed.

I also verified correctness, and the row counts match the expected values.
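
For reference, the expected values can be derived from the scale factor alone. The sketch below is only an illustration, not the check that was actually run here: it assumes the standard pgbench sizing of 1 branch, 10 tellers and 100,000 accounts per unit of scale, and the helper name `expected_row_counts` is made up for this example.

```python
# Rows created by `pgbench -i` per unit of scale factor
# (standard pgbench sizing; assumed unchanged by the patch).
ROWS_PER_SCALE = {
    "pgbench_branches": 1,
    "pgbench_tellers": 10,
    "pgbench_accounts": 100_000,
}


def expected_row_counts(scale: int) -> dict[str, int]:
    """Return the expected row count per table after `pgbench -i -s <scale>`."""
    if scale < 1:
        raise ValueError("scale must be >= 1")
    counts = {table: per_scale * scale for table, per_scale in ROWS_PER_SCALE.items()}
    # pgbench_history is created empty by initialization.
    counts["pgbench_history"] = 0
    return counts


for s in (100, 500, 2000):
    print(s, expected_row_counts(s))
```

Comparing these numbers against `SELECT count(*)` on each table after initialization is one simple way to confirm that no partition was skipped or loaded twice.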

Overall, the partitioned parallel approach looks solid and scales well in my tests.
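
To make the "one thread per partition" idea concrete, here is a rough sketch of how the account id space could be split. It is an assumption-laden illustration, not the v3 patch: it assumes contiguous, ceiling-divided ranges in the style of pgbench's range partitioning, and `partition_ranges` and the use of `len` as a stand-in for the real per-partition load are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

ACCOUNTS_PER_SCALE = 100_000  # standard pgbench sizing


def partition_ranges(scale: int, partitions: int) -> list[range]:
    """Split account ids 1..scale*100000 into contiguous per-partition ranges."""
    if scale < 1 or partitions < 1:
        raise ValueError("scale and partitions must be >= 1")
    total = scale * ACCOUNTS_PER_SCALE
    # Ceiling division so the last partition absorbs any shortfall.
    part_size = (total + partitions - 1) // partitions
    ranges = []
    for p in range(partitions):
        start = min(p * part_size, total) + 1
        stop = min((p + 1) * part_size, total) + 1  # exclusive upper bound
        ranges.append(range(start, stop))
    return ranges


ranges = partition_ranges(scale=100, partitions=10)

# One worker per partition; len() stands in for "load these rows".
with ThreadPoolExecutor(max_workers=len(ranges)) as pool:
    loaded = list(pool.map(len, ranges))

print(loaded, sum(loaded))
```

Because the ranges are disjoint and each maps to exactly one partition, the workers never write to the same table, which is what makes this case easier to parallelize than the unpartitioned one.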

Thanks again for the work on this.

Best regards,
Lakshmi


On Sat, Apr 11, 2026 at 12:07 AM Mircea Cadariu <cadariu.mircea@gmail.com> wrote:
> Hi,
>
> On 07/04/2026 10:00, Heikki Linnakangas wrote:
> >
> > This all makes more sense in the partitioned case. Perhaps we should
> > parallelize only when partitioned are used, and use only one thread
> > per partition.
> >
> Thanks for having a look. I attached v3 that parallelizes only the
> partitioned case, one thread per partition. Results:
>
>
> patch:
>
> pgbench -i -s 100 --partitions 10
>
> done in 12.63 s (drop tables 0.05 s, create tables 0.01 s, client-side
> generate 5.98 s, vacuum 1.63 s, primary keys 4.96 s).
>
>
> master:
>
> pgbench -i -s 100 --partitions 10
>
> done in 29.29 s (drop tables 0.00 s, create tables 0.02 s, client-side
> generate 16.31 s, vacuum 7.78 s, primary keys 5.18 s).
>
> --
> Thanks,
> Mircea Cadariu
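
The two quoted summary lines can be compared phase by phase. The following is a small sketch that parses the "done in ..." lines exactly as quoted above and prints master/patch ratios; the helper names and the 0.1 s noise threshold are choices made for this example, not part of pgbench.

```python
import re

PATCH = ("done in 12.63 s (drop tables 0.05 s, create tables 0.01 s, client-side "
         "generate 5.98 s, vacuum 1.63 s, primary keys 4.96 s).")
MASTER = ("done in 29.29 s (drop tables 0.00 s, create tables 0.02 s, client-side "
          "generate 16.31 s, vacuum 7.78 s, primary keys 5.18 s).")


def parse_summary(line: str) -> dict[str, float]:
    """Parse a pgbench -i summary line into {phase name: seconds}."""
    phases = {"total": float(re.search(r"done in ([\d.]+) s", line).group(1))}
    inner = line[line.index("(") + 1 : line.rindex(")")]
    for part in inner.split(", "):
        name, seconds = re.fullmatch(r"(.+?) ([\d.]+) s", part).groups()
        phases[name] = float(seconds)
    return phases


def speedups(master: dict[str, float], patch: dict[str, float],
             min_seconds: float = 0.1) -> dict[str, float]:
    """Return master/patch time ratios, skipping phases too short to compare."""
    return {
        phase: master[phase] / patch[phase]
        for phase in patch
        if master[phase] >= min_seconds and patch[phase] >= min_seconds
    }


ratios = speedups(parse_summary(MASTER), parse_summary(PATCH))
for phase, ratio in ratios.items():
    print(f"{phase}: {ratio:.2f}x")
```

On these numbers the total improves by roughly 2.3x and data generation by roughly 2.7x, while the primary key phase is essentially unchanged, which fits the fact that only the loading was parallelized.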