Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1rsRRd-0075G7-BC for pgsql-general@arkaria.postgresql.org; Thu, 04 Apr 2024 18:04:13 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.94.2) (envelope-from ) id 1rsRRc-00AoRB-Ab for pgsql-general@arkaria.postgresql.org; Thu, 04 Apr 2024 18:04:12 +0000 Received: from magus.postgresql.org ([2a02:c0:301:0:ffff::29]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1rsRRc-00AoR2-0D for pgsql-general@lists.postgresql.org; Thu, 04 Apr 2024 18:04:12 +0000 Received: from mail-ot1-x32a.google.com ([2607:f8b0:4864:20::32a]) by magus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.94.2) (envelope-from ) id 1rsRRZ-000e3A-GB for pgsql-general@postgresql.org; Thu, 04 Apr 2024 18:04:11 +0000 Received: by mail-ot1-x32a.google.com with SMTP id 46e09a7af769-6e89c87a72eso924219a34.0 for ; Thu, 04 Apr 2024 11:04:09 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1712253847; x=1712858647; darn=postgresql.org; h=to:subject:message-id:date:from:mime-version:from:to:cc:subject :date:message-id:reply-to; bh=2jre0+fxi/5qSnlB9CG+KW+jq71aOAT5YMaxvtI4N5c=; b=NMUH/z4ozbLN1Ih5AcsQ0euUDkrtQLX1iKd+OA0zhhDi4FBFobXwjYzSnU+d0MTXJN lp5UMhqnKbSBvAW0VST4XUG/PwLI5jLQi64V7mjaC8uO+3uYz11MFj4SGE8hAogfOLDi EOIErESzFQ+Hn/IIRVAmfpO7mrDjXXTWZCG2dCUV7WtFv88vW0rm9DcxAxHiZvEXcGjp EeYxFDn/IVMCbtaMRsEMThh43e91VWAwI5MvjAO+ztJsdSmyLAz3fcgsd/B7PnGnGUkZ Cz64P2OHnOxpGpb/nHrgLISEUWi8TOBR0vy+4SWMQBrhFRZjBVYmjXKqtIWXZPcviRwY 5Z5w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1712253847; x=1712858647; h=to:subject:message-id:date:from:mime-version:x-gm-message-state :from:to:cc:subject:date:message-id:reply-to; bh=2jre0+fxi/5qSnlB9CG+KW+jq71aOAT5YMaxvtI4N5c=; b=NCDTsQr05cqeHQihYtqLl11oW8RMtnNopaZZEqZFyWExzcMXsylz7qVCGV8i1xCf46 s2vNrAc52pmVQ5laeoloPZtNOluFdRsBYj614mke072k16oc3hqC0lIaCVHTh0Voph8I fV96dZRTwbZhdjiWRHx0UbnOmjtD15Js2xhRF/TwYiYtTrmugLc0VdSimXmjK6dipSUi /PYpR7gg4jU0TNqvb5Ditv7jdlSs3sFYySuVtDJr6imAVCYwqQtTLzevYxzWoJDyV4O7 wgvI3OHxa/17OyLfhzRZA6/hOpH6bB6/0zGWAmF79vw10J9oVnBs4MiMnMv+HjXXzoQT qF8g== X-Gm-Message-State: AOJu0YwLXGeRAieI5RdXrkeuM+go3vV9vGyexJ8UpvxWl/ruASGPvYDc z81rJ+QcZWgOVB7rQN8k2maW2A02uGzeQsPt+4oH/38hB6DqFSqXZHXBquFp+6/vMD68yvihHnd vLEn+LhBz1XfdNOVb/Xixegk2Cihv+hxdQd7CPg== X-Google-Smtp-Source: AGHT+IGIUxm+tHXFIQ7UWvuBO+SvaunBf1aFS4H+XuWuoKheEML6Tbfa1eLPRl+o1TCMrmKwcVSTyspORkf6KFWic+g= X-Received: by 2002:a05:6808:19a5:b0:3c5:d51d:788b with SMTP id bj37-20020a05680819a500b003c5d51d788bmr3778858oib.59.1712253847343; Thu, 04 Apr 2024 11:04:07 -0700 (PDT) MIME-Version: 1.0 From: Carl L Date: Thu, 4 Apr 2024 11:03:56 -0700 Message-ID: Subject: Multiple COPY statements for one table vs one for ~half a billion records To: pgsql-general@postgresql.org Content-Type: multipart/alternative; boundary="000000000000561d810615492a91" List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk --000000000000561d810615492a91 Content-Type: text/plain; charset="UTF-8" Hi there, I have around half a billion records that are being generated from a back end that are split into 80 threads (one per core) and I'm performing a copy from memory ( from stdin binary) into Postgres from each of these threads - i.e. there are 80 COPY statements being generated for one table that are running concurrently. I can see each of the Postgres processes sitting at around 15% CPU usage. These are all also in the same transaction - I am the only one connected, so it's not an issue to hold a big transaction. I can see that many of the Postgres threads have a wait event "LWLock: BufferContent", which I assume means that they are waiting for each other before they can write to the table. Therefore, would it be more efficient to combine all of these and put them into one COPY statement? Thanks! --000000000000561d810615492a91 Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable
Hi there,

I have around half a billion records that= are being generated from a back end that are split into 80 threads (one pe= r core) and I'm performing a copy from memory ( from stdin binary) into= Postgres from each of these threads - i.e. there are 80 COPY statements be= ing generated for one table that are running concurrently. I can see each o= f the Postgres processes sitting at around 15% CPU usage.

These are = all also in the same transaction - I am the only one connected, so it's= not an issue to hold a big transaction.

I can see that many of the = Postgres threads have a wait event "LWLock: BufferContent", which= I assume means that they are waiting for each other before they can write = to the table. Therefore, would it be more efficient to combine all of these= and put them into one COPY statement?

Thanks!
--000000000000561d810615492a91--