Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1spQtg-001Soq-Jl for pgsql-general@arkaria.postgresql.org; Sat, 14 Sep 2024 11:25:01 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.94.2) (envelope-from ) id 1spQtf-00Apuq-EH for pgsql-general@arkaria.postgresql.org; Sat, 14 Sep 2024 11:24:59 +0000 Received: from magus.postgresql.org ([2a02:c0:301:0:ffff::29]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1spQtf-00Apui-2P for pgsql-general@lists.postgresql.org; Sat, 14 Sep 2024 11:24:59 +0000 Received: from mail.hjp.at ([212.17.106.138] helo=rorschach.hjp.at) by magus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1spQtX-00193u-Bh for pgsql-general@lists.postgresql.org; Sat, 14 Sep 2024 11:24:58 +0000 Received: by rorschach.hjp.at (Postfix, from userid 1000) id BDC201ACB4; Sat, 14 Sep 2024 13:24:51 +0200 (CEST) Date: Sat, 14 Sep 2024 13:24:51 +0200 From: "Peter J. Holzer" To: pgsql-general@lists.postgresql.org Subject: Re: update faster way Message-ID: <20240914112451.bgxnbjv5b6unoijc@hjp.at> Mail-Followup-To: pgsql-general@lists.postgresql.org References: <1955d0e6cafd643520d282a74d9956340983074e.camel@cybertec.at> MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha512; protocol="application/pgp-signature"; boundary="hsrlepbrqwkv4lby" Content-Disposition: inline In-Reply-To: List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk --hsrlepbrqwkv4lby Content-Type: text/plain; charset=iso-8859-1 Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On 2024-09-14 16:10:15 +0530, yudhi s wrote: > On Sat, 14 Sept, 2024, 1:09 pm Laurenz Albe, w= rote: >=20 > On Sat, 2024-09-14 at 08:43 +0530, yudhi s wrote: > > We have to update a column value(from numbers like '123' to codes l= ike > 'abc' > > by looking into a reference table data) in a partitioned table with > billions > > of rows in it, with each partition having 100's millions rows. As we > tested >=20 > > for ~30million rows it's taking ~20minutes to update. [...] > > 2) should we run each individual partition in a separate session (e= =2Eg. > five > >=A0 =A0 partitions will have the updates done at same time from 5 di= fferent > >=A0 =A0 sessions)? And will it have any locking effect or we can jus= t=A0start > the > >=A0 =A0 sessions and let them run without impacting our live transac= tions? >=20 > Option 2 is possible, and you can even have more than one session wor= kingr > on a single partition. >=20 > However, the strain on your system's resources and particularly the r= ow > locks will impair normal database work. >=20 > Essentially, you can either take an extended down time or perform the > updates > in very small chunks with a very low "lock_timeout" over a very long = period > of time.=A0 If any of the batches fails because of locking conflicts,= it has > to be retried. >=20 > Investigate with EXPLAIN (ANALYZE) why the updates take that long.=A0= It > could > be a lame disk, tons of (unnecessary?) indexes or triggers, but it mi= ght as > well be the join with the lookup table, so perhaps there is room for > improvement (more "work_mem" for a hash join?) >=20 >=20 >=20 > Thank you so much Laurenz. >=20 > We have mostly insert/update happen on current day/live partition. So > considering that, if we will run batch updates(with batch size of 1000) f= rom > five different sessions in parallel on different historical partition, at= any > time they will lock 5000 rows and then commit. If you are updating billions of rows in batches of 5000, that means you are executing hundreds of thousands or millions of update statements. Which in turn means that you want as little overhead as possible per batch which means finding those 5000 rows should be quick. Which brings us back to Igor's question: Do you have any indexes in place which speed up finding those 5000 rows (the primary key almost certainly won't help with that). EXPLAIN (ANALYZE) (as suggested by Laurenz) will certainly help answering that question. > And also those rows will not collide with each other. So do you think > that approach can anyway cause locking issues? No, I don't think so. With a batch size that small I wouldn't expect problems even on the live partition. But of course many busy parallel sessions will put additional load on the system which may or may not be noticeable by users (you might saturate the disks writing WAL entries for example, which would slow down other sessions trying to commit). > Regarding batch update with batch size of 1000, do we have any method exi= sts in > postgres (say like forall statement in Oracle) which will do the batch dm= l. Can > you please guide me here, how we can do it in postgres. Postgres offers several server side languages. As an Oracle admin you will probably find PL/pgSQL most familiar. But you could also use Perl or Python or several others. And of course you could use any programming/scripting language you like on the client side. hp --=20 _ | Peter J. Holzer | Story must make more sense than reality. |_|_) | | | | | hjp@hjp.at | -- Charles Stross, "Creative writing __/ | http://www.hjp.at/ | challenge!" --hsrlepbrqwkv4lby Content-Type: application/pgp-signature; name="signature.asc" -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEETtJbRjyPwVTYGJ5k8g5IURL+KF0FAmblcn0ACgkQ8g5IURL+ KF2hWQ//QOECmhukYit2/ZbsyVkiwNVa7aMSfXmkaUHNShKikHyOa/z9C/dJfoF+ 176gX1Ay++J2LPdakzBW90aGw82vRtnh5tfWDSbe7qd/SEIq4IRpVpNRO4Bh6p/K bBelC6IWpAZFpQWq/YJGdWdO4BXPzv+aeOUTdDIjetpLjLaRcI+IWue2UJQQxPcp W+vNLxnGmgUEptQy9vqrpnT5XkS1EcDv1xUFggA+NoMWuP+BP8Go9oHc6kqlVLDf FaZEobCSNA3kjGHM8xgCA3FStOvHg9FFA87HPwliYyDbRWNrf8AO5mu1dDnhtEIy OqEd9sJuNZqfdHRcs+fPZh3sEM/RZKAbtUSz2vJSYBVfmUHI8GPCQ9WbcW22lR6g Xhfa96ni2XAlkHWTngQCB0qNRdSpgc4ClLyJZCbsCK0/WDaUXW7S5XeFen/wUsnR RyPpDYcM6vG54vnRCKzXWqr/sl36mhBlzRhJMyEC2HrVzsu/B1+NTkH5pM9JhhTh /PdCz35BTaol2pgxoqFukcfS34RCdHaNBNX9ZtGIH+PJdqhp1DLVCQkBSkS1VjZh 3Nr9IeHwaQAkVHlFC5ERUgaxUaFbehgYLvQfuw09jkhIrFMxjYnUy4reGPQYLbX6 oDNza75sEK33GbC5F6UyYdpEhtLQK/8enoNv78qfCPfPSFwsdwQ= =8tcD -----END PGP SIGNATURE----- --hsrlepbrqwkv4lby--