Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1rsRcz-0076tU-DF for pgsql-general@arkaria.postgresql.org; Thu, 04 Apr 2024 18:15:57 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.94.2) (envelope-from ) id 1rsRcy-00BAw9-I3 for pgsql-general@arkaria.postgresql.org; Thu, 04 Apr 2024 18:15:56 +0000 Received: from magus.postgresql.org ([2a02:c0:301:0:ffff::29]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1rsRcy-00BAuJ-70 for pgsql-general@lists.postgresql.org; Thu, 04 Apr 2024 18:15:56 +0000 Received: from mail-oa1-x2e.google.com ([2001:4860:4864:20::2e]) by magus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.94.2) (envelope-from ) id 1rsRcv-000eBO-T9 for pgsql-general@postgresql.org; Thu, 04 Apr 2024 18:15:55 +0000 Received: by mail-oa1-x2e.google.com with SMTP id 586e51a60fabf-221a9e5484aso768071fac.0 for ; Thu, 04 Apr 2024 11:15:53 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1712254551; x=1712859351; darn=postgresql.org; h=to:subject:message-id:date:from:in-reply-to:references:mime-version :from:to:cc:subject:date:message-id:reply-to; bh=hVkfkTSmIEzjQjgwUwROdjrcr9Xh0bVP8WelmBuhj0k=; b=byCbyGlHS4E2prJ+pnXQnibtjyZFoozeKGJDDoail+1gOFX3MZdpRh6ACCXRb/MA9F 6uY5tsYgxFQGeIayagozU7uHb5L6UD+tBzxxUwwelbsrKmCZhZie9jbIOPX0CThhybYT 66IVjD7++PLy/xPpNNZkEkbS2Ac7L/D6xMgSdmvWETb5svF085ifvkzR8NQeZMMqrvgT cgw0GNPwO3lAYsjE/kb54Co5rATlSdXH5G2jOmCVTExvinmnJ4bx+PjcDg6A4u9N8nt7 uITy6QCtERrWSqA/xFi1Pj7E0BC+Gxs8zScb7bOuwlTYsOcRXMRXGkfwcT720jaU+SAJ GyFQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1712254551; x=1712859351; h=to:subject:message-id:date:from:in-reply-to:references:mime-version :x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=hVkfkTSmIEzjQjgwUwROdjrcr9Xh0bVP8WelmBuhj0k=; b=i8iGzEEYQ+eX5llpopVwC9p1veL6yJTB20zaXY2rAE7x76+3sguyfG/pI8VNvGmCtC vNwXNSevJXfbWAZ++eZ/3/3nHcvCfd1NI8idlA6aeBBfpftWg9dvUa1c5ouBx1FOZEBS Xz3INGWgwLS2WVH5pdkNkBJsWBBKkZfzz29FpJQ3VxeW1E+QUrASW4kYSEM//yJH2RwV dQ1b2UOxjuszpNQeVyMIdHEftflBb00KTs6MxWhMddvlhyrJPhQt73ndgJj0+07rEslk hWY5UgK5niLYuYC/5cI/SqDX2ko2qW0rWaXRUc5yUBhpcBf/P7+NYPAgzajy6aDI14zc 4l+Q== X-Gm-Message-State: AOJu0YxUT35G2S2HngKTUwRE4Xtt2al2br5ZJTLN5WhctfRHXZ8HOPhE fWiwmMjRkubC9h9KzxRzm9/DTz0BGc42NP/BbgQPaEnT7kmawvDADw9sfeFSRWxULhmcgLpAhKU RUqlruw9yIs6UAPb6M83p4rOyz2EFny6J X-Google-Smtp-Source: AGHT+IFt6+nu1E2TvVNN5RlMPP5vOQxACFhcdWlEVsPDYyo7d3D8OG2D2S8JyjuVXzfQhaFlzuMKgOhUAfs37TsTywM= X-Received: by 2002:a05:6870:5d92:b0:22e:bb24:5c79 with SMTP id fu18-20020a0568705d9200b0022ebb245c79mr2547023oab.36.1712254551558; Thu, 04 Apr 2024 11:15:51 -0700 (PDT) MIME-Version: 1.0 References: In-Reply-To: From: Ron Johnson Date: Thu, 4 Apr 2024 14:15:40 -0400 Message-ID: Subject: Re: Multiple COPY statements for one table vs one for ~half a billion records To: pgsql-general Content-Type: multipart/alternative; boundary="0000000000004f9141061549544f" List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk --0000000000004f9141061549544f Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable On Thu, Apr 4, 2024 at 2:04=E2=80=AFPM Carl L wrote= : > Hi there, > > I have around half a billion records that are being generated from a back > end that are split into 80 threads (one per core) and I'm performing a co= py > from memory ( from stdin binary) into Postgres from each of these threads= - > i.e. there are 80 COPY statements being generated for one table that are > running concurrently. I can see each of the Postgres processes sitting at > around 15% CPU usage. > Is the target table partitioned in the same way that the input data is split? That would make things faster... > These are all also in the same transaction - I am the only one connected, > so it's not an issue to hold a big transaction. > Unless it fills up your WAL partition. > --0000000000004f9141061549544f Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable
On Thu, Apr 4, 2024 at 2:04=E2=80=AFPM Ca= rl L <cllewellyno@gmail.com= > wrote:
Hi there,

I have around half= a billion records that are being generated from a back end that are split = into 80 threads (one per core) and I'm performing a copy from memory ( = from stdin binary) into Postgres from each of these threads - i.e. there ar= e 80 COPY statements being generated for one table that are running concurr= ently. I can see each of the Postgres processes sitting at around 15% CPU u= sage.

Is the target table partiti= oned in the same way that the input data is split?

That would make things faster...
=C2=A0
These are all also in the= same transaction - I am the only one connected, so it's not an issue t= o hold a big transaction.

Unless = it fills up your WAL partition.
--0000000000004f9141061549544f--