Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1smIs2-004jFI-30 for pgsql-general@arkaria.postgresql.org; Thu, 05 Sep 2024 20:14:22 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.94.2) (envelope-from ) id 1smIs0-00GwSX-M9 for pgsql-general@arkaria.postgresql.org; Thu, 05 Sep 2024 20:14:21 +0000 Received: from magus.postgresql.org ([2a02:c0:301:0:ffff::29]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1smIs0-00GwSP-AE for pgsql-general@lists.postgresql.org; Thu, 05 Sep 2024 20:14:20 +0000 Received: from mail-yw1-x1135.google.com ([2607:f8b0:4864:20::1135]) by magus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.94.2) (envelope-from ) id 1smIru-000Kyh-Jx for pgsql-general@lists.postgresql.org; Thu, 05 Sep 2024 20:14:20 +0000 Received: by mail-yw1-x1135.google.com with SMTP id 00721157ae682-6d3c10af2efso12315707b3.0 for ; Thu, 05 Sep 2024 13:14:14 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1725567253; x=1726172053; darn=lists.postgresql.org; h=to:subject:message-id:date:from:mime-version:from:to:cc:subject :date:message-id:reply-to; bh=J++nuuAvFzunpSNz4j6sQfyhP0OomVNwG7LB+PCXx04=; b=nKc4YvIm8HhIK3oAcRR7hs8nqAPJ1cmOq4fJwVcJLJ3IMr3adwsdrDM9QoZkWdbI/X Ba6sULfMxkg7bejptzLrk2mven25aI1EJ0SKgLnH08hIk5mvGpxSSoTemLewoZ3Ewd27 /DE+wR3uC8psTYYLc/SP9PV9riptBGl4/k0AxqfECny8c0f6KKdKFF70XHfsUcUVxR+8 a0EFfSrvP03e3Kkl5YO48grOA9hJJpjwJVPbldYefypt7JX+OYllL8X9tUIr1perwL+q K+cLPQsY3yteyT7asX6NZNz0a6EDTVEYSQUfYJM8Ogrg3cBMuGuhlOU/7F0BiZubIspU v53A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1725567253; x=1726172053; h=to:subject:message-id:date:from:mime-version:x-gm-message-state :from:to:cc:subject:date:message-id:reply-to; bh=J++nuuAvFzunpSNz4j6sQfyhP0OomVNwG7LB+PCXx04=; b=kwvVAmKaJ4q0srHFd1n71Yr26nfxePykVxY3OQrz043DKoR/3//WfjZx2PC2TxuWW7 XVc1SKVv2BT3a8+hRBUmW58FPz7SinjnaLgEXRSErowR1DlJmpKss1Snd3OkGLzkbo8Z nrC+PmVcq265o5rGw/bDfPuWTPbPo3XmDVkcUrvbgdLhqNbwqiipnPIYsnQLuRQkNLLa vPQVMjfcstfcstYh/mp9xOQi2gjmsrcvULRTJOzpZYUnyQFC/6CEN0xVtkftGFEUchOO QkpWB89boOwombXZ4HjazO6iyDahOvIihwIvo9hmGqHwSew/droDbQKaVlINL22IDLq0 +SCQ== X-Gm-Message-State: AOJu0YyCl7qPKCXugt37jp2ds3pvo4v+pXdi6A00jf3tw2IvFYrwR6E9 RxV/VrDO6AeCbwXngXCr6dAndVo4gj98LAsaM+prVfVCA8fco8ivdSz7DvcKnH5tbycEgPb50SR kJt9sfSOihkvRZaWHt+U+HvXAjY6yrA== X-Google-Smtp-Source: AGHT+IEH6p7S6TbmySwe2tqoonrTge+WM5xn0iIeOXUw+JCwFa5MJ42D45juUbmUcYjW4NXCX6hiApYTL1tYOZxzXgA= X-Received: by 2002:a05:690c:660f:b0:6bd:e76d:19ea with SMTP id 00721157ae682-6db44d69bdcmr4381737b3.8.1725567252282; Thu, 05 Sep 2024 13:14:12 -0700 (PDT) MIME-Version: 1.0 From: Lok P Date: Fri, 6 Sep 2024 01:44:00 +0530 Message-ID: Subject: Faster data load To: pgsql-general Content-Type: multipart/alternative; boundary="0000000000001bc936062164efbb" List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk --0000000000001bc936062164efbb Content-Type: text/plain; charset="UTF-8" Hi, We are having a requirement to create approx 50 billion rows in a partition table(~1 billion rows per partition, 200+gb size daily partitions) for a performance test. We are currently using ' insert into select.. From or ;' method . We have dropped all indexes and constraints First and then doing the load. Still it's taking 2-3 hours to populate one partition. Is there a faster way to achieve this? Few teammate suggesting to use copy command and use file load instead, which will be faster. So I wanted to understand, how different things it does behind the scenes as compared to insert as select command? As because it only deals with sql engine only. Additionally, when we were trying to create indexes post data load on one partition, it took 30+ minutes. Any possible way to make it faster? Is there any way to drive the above things in parallel by utilizing full database resources? It's postgres 15.4 Regards Lok --0000000000001bc936062164efbb Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable
Hi,=C2=A0

We are h= aving a requirement to create approx 50 billion rows in a partition table(~= 1 billion rows per partition, 200+gb size daily partitions) for a performan= ce test. We are currently using ' insert into <target table_partitio= n> select.. From <source_table_partition> or <some transformed = query>;' method . We have dropped all indexes and constraints First = and then doing the load. Still it's taking 2-3 hours to populate one pa= rtition. Is there a faster way to achieve this?=C2=A0

=
Few teammate suggesting to use copy command and us= e file load instead, which will be faster. So I wanted to understand, how d= ifferent things it does behind the scenes as compared to insert as select c= ommand? As because it only deals with sql engine only.=C2=A0

Additionally, when we were trying to = create indexes post data load on one partition, it took 30+ minutes. Any po= ssible way to make it faster?=C2=A0

Is there any way to drive the above things in parallel by utili= zing full database resources?

It's postgres 15.4

Regards
Lok
--0000000000001bc936062164efbb--