Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1spfT0-003Jms-EP for pgsql-general@arkaria.postgresql.org; Sun, 15 Sep 2024 02:58:27 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.94.2) (envelope-from ) id 1spfSz-009plL-Gc for pgsql-general@arkaria.postgresql.org; Sun, 15 Sep 2024 02:58:25 +0000 Received: from makus.postgresql.org ([2001:4800:3e1:1::229]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1spfSy-009plA-SZ for pgsql-general@lists.postgresql.org; Sun, 15 Sep 2024 02:58:24 +0000 Received: from mail-io1-xd32.google.com ([2607:f8b0:4864:20::d32]) by makus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.94.2) (envelope-from ) id 1spfSr-001DXl-Qp for pgsql-general@lists.postgresql.org; Sun, 15 Sep 2024 02:58:23 +0000 Received: by mail-io1-xd32.google.com with SMTP id ca18e2360f4ac-82cdb0baf19so211072439f.3 for ; Sat, 14 Sep 2024 19:58:18 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1726369097; x=1726973897; darn=lists.postgresql.org; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:from:to:cc:subject:date:message-id:reply-to; bh=kQvDk53f5CeNn3QtnxJ+uwkMuiY5+S5GIMCN0o+WLwg=; b=Xsx1f0qxYlyiT0GKHh5qodpalsX0N0NtbUMU9e5Q465rcmaf4zhqqiKlsc7CBoYfCS 2Iajb6vuqq80abBRK0SCBFOCMesgB7tiLHHIt9+YJ0tGQumijsCTP0IBKZcL8/g5b3fI 4T2pIA4sQwTBAt4rLveUHw0ocY3uheVyGCyiFc3pUPzUYS75DmSnkA13NMTi6gClgYV2 5ThrYQow/+hLTvhYI1RP1IZzs9HRSm9WJazTXJ/kZYMvfz8nzG+OXr1nwr1Ga1um81LO oLaOos1CYODxdVBkqbv5hgo7NnZAd88HPyitx8ZW1yK+yp5Rl87MhaUrn8KHoimZt0yb Zlxw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1726369097; x=1726973897; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=kQvDk53f5CeNn3QtnxJ+uwkMuiY5+S5GIMCN0o+WLwg=; b=TRVBbyZRWxDOM+s7e5PiK1102Ld5lxREQ93ZfRpM+S+yEIttRFlq2xAQufUscxAqtA oLXMgkXpNb6uIsQMcoHVK8wnpZpjbs76coj+tn+gAh71zY1k7c8iluZ1BTmSeXm+UkIc KKNGSa4UAOsZJYvjvkoxj8WK/It+yp+94Z0/8QG8TJpUy2AAELdCBn6K5ZooA//54rqb xRDIQGPzdiWMZQ+BNenPBA09jyZqH/WVoulRpzaVtKzof+xKd1bFSjbDv7ckjprMcu6z TGV1sY1RX0T5kwBaIFPOEHFNn2GokqshzCqmDWv9O2DO+PWq4+Ef4Yf2GhFDlkzTvZQR BPjw== X-Gm-Message-State: AOJu0YzPEZ6rR3aEA5VzybjZ2NByLyFEH3biKP2E3adDYEQO3EeYanRS FdPMDp9YVW7N5aatVRCZrSZ9BkI54PxPcMiq6ey+ezkYgzhXYBR9bf6YxVi1BDQzaPyvOP+RhjX WWFBHPjF09rYF68B8Uy6a1WCwS5dsXaZOHg== X-Google-Smtp-Source: AGHT+IG9ormFrqqVIUs16apVSHNoN/IzjOTUL6mqvBeHYeX0+mrxmZcNWIcJpTxgLbbdyEikC5q4yyMIblxuy7abVN0= X-Received: by 2002:a05:6870:5250:b0:277:eea4:a436 with SMTP id 586e51a60fabf-27c3f0edc61mr6952376fac.7.1726368720579; Sat, 14 Sep 2024 19:52:00 -0700 (PDT) MIME-Version: 1.0 References: In-Reply-To: From: Juan Rodrigo Alejandro Burgos Mella Date: Sat, 14 Sep 2024 23:51:49 -0300 Message-ID: Subject: Re: update faster way To: yudhi s Cc: pgsql-general Content-Type: multipart/alternative; boundary="0000000000005776ff06221f8a88" List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk --0000000000005776ff06221f8a88 Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable The only way that I see as plausible to use a subquery, both in the query and in the setting of the variable, is that the relationship is one to one, and that there is an index that responds to the predicate UPDATE table1 t1 SET column_value =3D (SELECT FROM table2 t2 WHERE t2.column_relatio= n =3D t1.column_relation) WHERE (colum_relation) IN (SELECT column_relation FROM table2) PD: the index of being in table2 Atte JRBM El s=C3=A1b, 14 sept 2024 a las 0:22, yudhi s () escribi=C3=B3: > Hello, > We have to update a column value(from numbers like '123' to codes like > 'abc' by looking into a reference table data) in a partitioned table with > billions of rows in it, with each partition having 100's millions rows. A= s > we tested for ~30million rows it's taking ~20minutes to update. So if we = go > by this calculation, it's going to take days for updating all the values. > So my question is > > 1) If there is any inbuilt way of running the update query in parallel > (e.g. using parallel hints etc) to make it run faster? > 2) should we run each individual partition in a separate session (e.g. > five partitions will have the updates done at same time from 5 different > sessions)? And will it have any locking effect or we can just start the > sessions and let them run without impacting our live transactions? > > UPDATE tab_part1 > SET column1 =3D reftab.code > FROM reference_tab reftab > WHERE tab_part1.column1 =3D subquery.column1; > > Regards > Yudhi > --0000000000005776ff06221f8a88 Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable

The only way that I see as plausible to use a subquery= , both in the query and in the setting of the variable, is that the relatio= nship is one to one, and that there is an index that responds to the predic= ate

UPDATE table1 t1
SET column_value =3D (SELECT <value> = FROM table2 t2 WHERE t2.column_relation =3D t1.column_relation)
WHERE (c= olum_relation) IN (SELECT column_relation FROM table2)

PD:=C2=A0the index of being in table2

Att= e
JRBM

El s=C3=A1b, 14 sept 2024 a las 0:22, yudhi s (<learnerdatabase99@gmail.com>) escribi=C3=B3:
Hello,
We have to update a column value(from numb= ers like '123' to codes like 'abc' by looking into a refere= nce table data) in a partitioned table with billions of rows in it, with ea= ch partition having 100's millions rows. As we tested for ~30million ro= ws it's taking ~20minutes to update. So if we go by this calculation, i= t's going to take days for updating all the values. So my question is
1) If there is any inbuilt way of running the update query in paralle= l (e.g. using parallel hints etc) to make it run faster?
2) should we ru= n each individual partition in a separate session (e.g. five partitions wil= l have the updates done at same time from 5 different sessions)? And will i= t have any locking effect or we can just=C2=A0start the sessions and let th= em run without impacting our live transactions?

UPDATE tab_part1
= SET column1 =3D reftab.code
FROM reference_tab reftab
WHERE tab_part1= .column1 =3D subquery.column1;

Regards
Yud= hi
--0000000000005776ff06221f8a88--