public inbox for [email protected]
help / color / mirror / Atom feedFrom: Muhammad Usman Khan <[email protected]>
To: yudhi s <[email protected]>
Cc: pgsql-general <[email protected]>
Subject: Re: update faster way
Date: Tue, 17 Sep 2024 10:36:50 +0500
Message-ID: <CAPnRvGs0VBL=2Huz4qChHdh+wKO-CuimrKzZt3xFiRHxkAc5vA@mail.gmail.com> (raw)
In-Reply-To: <CAEzWdqd-22B-bpVdT3yzegiOig9zvJnfJvi=GOMFfHT-Jg8CgQ@mail.gmail.com>
References: <CAEzWdqd-22B-bpVdT3yzegiOig9zvJnfJvi=GOMFfHT-Jg8CgQ@mail.gmail.com>
Hi,
You can solve this problem using Citus in PostgreSQL, which is specifically
designed for parallelism
SELECT create_distributed_table('tab_part1', 'partition_key');
SELECT create_distributed_table('reference_tab', 'reference_key');
UPDATE tab_part1
SET column1 = reftab.code
FROM reference_tab reftab
WHERE tab_part1.column1 = reftab.column1;
On Sat, 14 Sept 2024 at 08:22, yudhi s <[email protected]> wrote:
> Hello,
> We have to update a column value(from numbers like '123' to codes like
> 'abc' by looking into a reference table data) in a partitioned table with
> billions of rows in it, with each partition having 100's millions rows. As
> we tested for ~30million rows it's taking ~20minutes to update. So if we go
> by this calculation, it's going to take days for updating all the values.
> So my question is
>
> 1) If there is any inbuilt way of running the update query in parallel
> (e.g. using parallel hints etc) to make it run faster?
> 2) should we run each individual partition in a separate session (e.g.
> five partitions will have the updates done at same time from 5 different
> sessions)? And will it have any locking effect or we can just start the
> sessions and let them run without impacting our live transactions?
>
> UPDATE tab_part1
> SET column1 = reftab.code
> FROM reference_tab reftab
> WHERE tab_part1.column1 = subquery.column1;
>
> Regards
> Yudhi
>
view thread (4+ messages) latest in thread
reply
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Reply to all the recipients using the --to and --cc options:
reply via email
To: [email protected]
Cc: [email protected], [email protected], [email protected]
Subject: Re: update faster way
In-Reply-To: <CAPnRvGs0VBL=2Huz4qChHdh+wKO-CuimrKzZt3xFiRHxkAc5vA@mail.gmail.com>
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox