From: lakshmi
Date: Tue, 17 Feb 2026 11:41:13 +0530
Subject: Re: parallel data loading for pgbench -i
To: "Hayato Kuroda (Fujitsu)"
Cc: Mircea Cadariu, PostgreSQL Hackers, "tomas@vondra.me"
References: <4c1d0b97-a5f8-472c-afdd-bdeb09b93f33@gmail.com>

Hi Mircea, Hayato,
I ran a few more tests on 19devel, focusing on the partitioned case to better understand the performance behavior.

For scale 500, the serial initialization on my system takes around 34.3 seconds. Using parallel initialization without partitions (-j 10) makes the client-side data generation noticeably faster, but the overall runtime ends up slightly higher because the vacuum phase becomes much longer.

However, when running with partitions (pgbench -i -s 500 --partitions=10 -j 10), the total runtime drops to about 21.9 seconds, and the vacuum cost is much smaller. I also verified that the row counts are correct in all cases, and regression tests still pass locally.
So it looks like the main benefit of parallel initialization shows up clearly in the partitioned setup, which matches the expectations discussed earlier. Just sharing these observations in case they are useful for the ongoing review.
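
For reference, the two timings above (34.3 s serial, 21.9 s with --partitions=10 -j 10) work out to roughly a 1.57x speedup, or about a 36% shorter total runtime. A minimal sketch of that arithmetic follows; the helper names are only illustrative and are not part of pgbench or the patch:

```c
#include <assert.h>

/* Illustrative helper: how many times faster the new run is. */
double
speedup(double baseline_s, double new_s)
{
    assert(baseline_s > 0.0 && new_s > 0.0);
    return baseline_s / new_s;
}

/* Illustrative helper: runtime reduction relative to the baseline, in percent. */
double
runtime_reduction_pct(double baseline_s, double new_s)
{
    assert(baseline_s > 0.0);
    return (baseline_s - new_s) / baseline_s * 100.0;
}
```

These are single-run numbers from one machine, so the ratio is only indicative.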
Thanks again for the work on this patch.

Best regards,
Lakshmi

On Wed, Feb 11, 2026 at 5:53 PM Hayato Kuroda (Fujitsu) <kuroda.hayato@fujitsu.com> wrote:
Dear Mircea,

Thanks for the proposal. I also feel the initialization wastes time.
Here are my initial comments.

01.
I found that pgbench raises a FATAL in case of -j > --partitions. Is there a
specific reason?
If needed, we may choose the softer way, which lowers nthreads to the number
of partitions. -c and -j are already handled in a similar way:

```
if (nthreads > nclients && !is_init_mode)
    nthreads = nclients;
```
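
A minimal sketch of the softer behavior described in 01, assuming a hypothetical helper named clamp_init_threads() that is not part of pgbench or of the patch. It treats partitions == 0 as "not partitioned" and leaves the thread count alone in that case, which is itself an assumption given the open question in 02:

```c
#include <assert.h>

/*
 * Hypothetical helper: choose the number of loader threads for
 * "pgbench -i".  With partitions, never use more threads than partitions;
 * without partitions, return the requested value unchanged.
 */
int
clamp_init_threads(int nthreads, int partitions)
{
    assert(nthreads >= 1 && partitions >= 0);

    if (partitions > 0 && nthreads > partitions)
        return partitions;
    return nthreads;
}
```

With this, -j 10 --partitions=4 would quietly run 4 loader threads instead of raising an error.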

02.
Also, why is -j accepted in case of non-partitions?

03.
Can we port all validation to main()? I found initPopulateTableParallel() has
such a part.

04.
Copying seems to be divided into chunks per COPY_BATCH_SIZE. Is it really essential to parallelize the initialization? I feel it may speed up even the serial case, and thus can be discussed independently.
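
To make the chunking in 04 concrete, here is a rough sketch of splitting a row range into fixed-size batches, independent of any threading. The function names and the batch size passed in are assumptions for illustration only; the actual value of COPY_BATCH_SIZE and the patch's own loop are not shown in this thread:

```c
#include <assert.h>
#include <stdint.h>

/* Number of batches needed to cover total_rows (ceiling division). */
int64_t
batch_count(int64_t total_rows, int64_t batch_size)
{
    assert(total_rows >= 0 && batch_size > 0);
    return (total_rows + batch_size - 1) / batch_size;
}

/* Half-open row range [*start, *end) covered by batch idx (0-based). */
void
batch_bounds(int64_t total_rows, int64_t batch_size, int64_t idx,
             int64_t *start, int64_t *end)
{
    assert(idx >= 0 && idx < batch_count(total_rows, batch_size));

    *start = idx * batch_size;
    *end = *start + batch_size;
    if (*end > total_rows)
        *end = total_rows;      /* last batch may be shorter */
}
```

A serial loader could walk these batches in order, which is why the batching can be evaluated separately from the parallel load.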

05.
Per my understanding, each thread creates its tables, and all of them are attached to the parent table. Is it right? I think it needs more code
changes, and I am not sure it is critical to make initialization faster.
So I suggest using the incremental approach. The first patch only parallelizes
the data load, and the second patch implements the CREATE TABLE and ALTER TABLE
ATTACH PARTITION. You can benchmark three patterns, master, 0001, and
0001 + 0002, then compare the results. IIUC, this is the common approach to
reduce the patch size and make them more reviewable.

06.
Missing update for typedefs.list. WorkerTask and CopyTarget can be added there.

07.
Since there is a report like [1], you can benchmark more cases.

[1]: https://www.postgresql.org/message-id/CAEvyyTht69zjnosPjziW6dqNLqs-n6eKia2vof108zQp1QFX%3DQ%40mail.gmail.com

Best regards,
Hayato Kuroda
FUJITSU LIMITED