Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1spJNJ-000YzH-4m for pgsql-general@arkaria.postgresql.org; Sat, 14 Sep 2024 03:23:06 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.94.2) (envelope-from ) id 1spJMI-001pGo-My for pgsql-general@arkaria.postgresql.org; Sat, 14 Sep 2024 03:22:02 +0000 Received: from makus.postgresql.org ([2001:4800:3e1:1::229]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1spJMI-001pBX-9H for pgsql-general@lists.postgresql.org; Sat, 14 Sep 2024 03:22:02 +0000 Received: from mail-io1-xd2e.google.com ([2607:f8b0:4864:20::d2e]) by makus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.94.2) (envelope-from ) id 1spJMF-0013kb-GQ for pgsql-general@lists.postgresql.org; Sat, 14 Sep 2024 03:22:01 +0000 Received: by mail-io1-xd2e.google.com with SMTP id ca18e2360f4ac-82ceab75c27so84643439f.1 for ; Fri, 13 Sep 2024 20:21:59 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1726284118; x=1726888918; darn=lists.postgresql.org; h=to:subject:message-id:date:from:mime-version:from:to:cc:subject :date:message-id:reply-to; bh=YqS183Gn/EEfDRnVEXBKOn/CKyOFPm8Fr9zCe1jAaxc=; b=mIAGGguoGO32jT8utr7G0mBHotqJVwA1+T5YwRThUNdkGufZHON62Ggv3WSFJFDlX6 cZI31uVFT4pIq3UV4BejXpASrE3hEtKOG6lg9I9m2jsPc6f2gDZl9GAqoUMWgeEGKBsx GnXXnnV0NOpEtJTNZwlKEyIDlNkU6JUDWHuR2oxV8UNLqtfJgSldyZT2JJOl9KJ+Elac UZE0xqOE4RaRN/Obu9a60x3JCGERQYcRQPLpwq2ajPfEppg3WcvgL5gM9cheqtJA+Nfs qzKfWt2CuH+1/FVO+7jOzmpvlpRZBSytHzi24W1VoQO+TQ5y9IKxf7iOq4lfdI+K7sGN eATw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1726284118; x=1726888918; h=to:subject:message-id:date:from:mime-version:x-gm-message-state :from:to:cc:subject:date:message-id:reply-to; bh=YqS183Gn/EEfDRnVEXBKOn/CKyOFPm8Fr9zCe1jAaxc=; b=EidCJ5mk9v3kLzl3NA5Ik1N4ghArencKbNLcYzHIEKM/aNiUfnZrOWQXEOJwfEdNrX O98Qn1uoQ/yPGARn9LcKFU7/iiMC+YvYy1PJmE5t1dkZ3bqhP6t8diuomjvZKunIK+He /iyYIC9AcCrxVZ+wd+UL74NsqpjtyOaVT1LGrNODv+nu820475ZHYtfXrKn/W0x3AnPi cNqhLXXAG40RdU0Ma6ZU9DiEChrnQY+JgXsFW712whc4F+4i91JwHsX2e8eESyfjdlqW 2i/Q0O/Zup8M2te6kT1sQhwocSHNrkiO0PDqjnknPj0strLLnGWczMCz+fMfcJQU07aH TlHw== X-Gm-Message-State: AOJu0YycJUIoS3t0Au91Ns+sesOY8dDFvb+5V0Iz9veOCxgW7o5rzrN3 tebcBxL5ND5do0bYrUgsayF9Et7vTyd0SGzy26CGJheuxDze4eLFDQWCxthsm68naj29jncIhft 36iGaR2EikfZY3WrjwLUMIBHYok5PxK7i X-Google-Smtp-Source: AGHT+IHBrbGaafy7XLONBrBaTNWojYX72gHCELGWyYWW7YtdtwfgQVCfJYQDuVnFQXTTcnG0cnq2P6ht0aVUCVt45ro= X-Received: by 2002:a05:6870:169a:b0:278:1d21:2b34 with SMTP id 586e51a60fabf-27c68a2620amr3699512fac.27.1726283626350; Fri, 13 Sep 2024 20:13:46 -0700 (PDT) MIME-Version: 1.0 From: yudhi s Date: Sat, 14 Sep 2024 08:43:34 +0530 Message-ID: Subject: update faster way To: pgsql-general Content-Type: multipart/alternative; boundary="00000000000054909c06220bbae5" List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk --00000000000054909c06220bbae5 Content-Type: text/plain; charset="UTF-8" Hello, We have to update a column value(from numbers like '123' to codes like 'abc' by looking into a reference table data) in a partitioned table with billions of rows in it, with each partition having 100's millions rows. As we tested for ~30million rows it's taking ~20minutes to update. So if we go by this calculation, it's going to take days for updating all the values. So my question is 1) If there is any inbuilt way of running the update query in parallel (e.g. using parallel hints etc) to make it run faster? 2) should we run each individual partition in a separate session (e.g. five partitions will have the updates done at same time from 5 different sessions)? And will it have any locking effect or we can just start the sessions and let them run without impacting our live transactions? UPDATE tab_part1 SET column1 = reftab.code FROM reference_tab reftab WHERE tab_part1.column1 = subquery.column1; Regards Yudhi --00000000000054909c06220bbae5 Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable
Hello,
We have to update a column value(from numbers li= ke '123' to codes like 'abc' by looking into a reference ta= ble data) in a partitioned table with billions of rows in it, with each par= tition having 100's millions rows. As we tested for ~30million rows it&= #39;s taking ~20minutes to update. So if we go by this calculation, it'= s going to take days for updating all the values. So my question is

= 1) If there is any inbuilt way of running the update query in parallel (e.g= . using parallel hints etc) to make it run faster?
2) should we run each= individual partition in a separate session (e.g. five partitions will have= the updates done at same time from 5 different sessions)? And will it have= any locking effect or we can just=C2=A0start the sessions and let them run= without impacting our live transactions?

UPDATE tab_part1
SET co= lumn1 =3D reftab.code
FROM reference_tab reftab
WHERE tab_part1.colum= n1 =3D subquery.column1;

Regards
Yudhi
--00000000000054909c06220bbae5--