public inbox for [email protected]  
help / color / mirror / Atom feed
From: youness bellasri <[email protected]>
To: Ron Johnson <[email protected]>
Cc: Pgsql-admin <[email protected]>
Subject: Re: Need an idea to operate massive delete operation on big size table.
Date: Wed, 15 Jan 2025 16:27:44 +0100
Message-ID: <CAP44EwknJPH3yzyLX-bHXccZJQHiO=bs5PBBAjaFmK4smGT0DQ@mail.gmail.com> (raw)
In-Reply-To: <CANzqJaCkSt3MkaeVT4wXxLROx3RY4dsArvGTYfn-VX6JxVTBfw@mail.gmail.com>
References: <CAHOGQfWPQhxoyQ5GEUEFwtdXan=krr5Ue0DDPSKJGpcW9Szi4Q@mail.gmail.com>
	<CANzqJaDsDMzwf_YAN3_gXqdUji1GvJg-M4Bn-kW_UT5xy0HB-g@mail.gmail.com>
	<CAP44Ew=Bsk5VLZMDXV0ZbeuwZqBmh6XG4Tg2iKnFoyq_-koxvA@mail.gmail.com>
	<CANzqJaCkSt3MkaeVT4wXxLROx3RY4dsArvGTYfn-VX6JxVTBfw@mail.gmail.com>

You're absolutely right—PostgreSQL doesn't have a DISABLE clause for
indexes or constraints like some other databases (e.g., SQL Server).
However, there are still ways to work around this limitation and optimize
large delete operations in PostgreSQL. Here are some PostgreSQL-specific
strategies:

1. **Batch Deletion**: Delete in small chunks using `LIMIT` or `CTE`.
   ```sql
   DELETE FROM your_table WHERE <condition> LIMIT 10000;
   ```

2. **Drop and Recreate Indexes**: Drop indexes before deletion, then
recreate them.
   ```sql
   DROP INDEX idx_name;
   DELETE FROM your_table WHERE <condition>;
   CREATE INDEX idx_name ON your_table(column_name);
   ```

3. **Partitioning**: Use table partitioning to drop entire partitions
instead of deleting rows.
   ```sql
   DROP TABLE your_table_partition_name;
   ```

4. **Archive and Truncate**: Copy rows to keep into a temp table, truncate
the original, then reinsert.
   ```sql
   CREATE TABLE temp_table AS SELECT * FROM your_table WHERE
<condition_to_keep>;
   TRUNCATE TABLE your_table;
   INSERT INTO your_table SELECT * FROM temp_table;
   DROP TABLE temp_table;
   ```

5. **Disable Triggers**: Temporarily disable triggers during deletion.
   ```sql
   ALTER TABLE your_table DISABLE TRIGGER ALL;
   DELETE FROM your_table WHERE <condition>;
   ALTER TABLE your_table ENABLE TRIGGER ALL;
   ```

6. **Optimize Settings**: Increase `work_mem` and `maintenance_work_mem`
for the session.
   ```sql
   SET work_mem = '1GB';
   SET maintenance_work_mem = '2GB';
   ```

7. **Vacuum and Analyze**: Clean up after deletion.
   ```sql
   VACUUM ANALYZE your_table;
   ```

8. **Soft Deletes**: Use an `is_deleted` flag to mark rows for later
cleanup.

By batching, partitioning, and optimizing settings, you can make large
deletes faster and less disruptive.

Le mer. 15 janv. 2025 à 16:22, Ron Johnson <[email protected]> a
écrit :

> Sadly, Postgresql does not have (super-handy) "DISABLE" clauses.
>
> On Wed, Jan 15, 2025 at 10:12 AM youness bellasri <
> [email protected]> wrote:
>
>> 1. *Batch Deletion*
>>
>> Instead of deleting all records at once, break the operation into smaller
>> batches. This reduces locking, transaction log growth, and the risk of
>> timeouts.
>> 2. *Use Indexes*
>>
>> Ensure that the columns used in the WHERE clause of the delete queries
>> are indexed. This speeds up the identification of rows to delete.
>> 3. *Disable Indexes and Constraints Temporarily*
>>
>> If the table has many indexes or constraints, disabling them during the
>> delete operation can speed up the process. Re-enable them afterward.
>>
>> Le mer. 15 janv. 2025 à 16:08, Ron Johnson <[email protected]> a
>> écrit :
>>
>>> On Wed, Jan 15, 2025 at 9:54 AM Gambhir Singh <[email protected]>
>>> wrote:
>>>
>>>> Hi,
>>>>
>>>> I received a request from a client to delete duplicate records from a
>>>> table which is very large in size.
>>>>
>>>> Delete queries (~2 Billion) are provided via file, and we have to
>>>> execute that file in DB. Last time it lasted for two days. I feel there
>>>> must be another way to delete records in an efficient manner
>>>>
>>>
>>> Maybe the delete "queries" are poorly written.  Maybe there's no
>>> supporting index.
>>>
>>>
> --
> Death to <Redacted>, and butter sauce.
> Don't boil me, I'm still alive.
> <Redacted> lobster!
>


reply

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Reply to all the recipients using the --to and --cc options:
  reply via email

  To: [email protected]
  Cc: [email protected], [email protected], [email protected]
  Subject: Re: Need an idea to operate massive delete operation on big size table.
  In-Reply-To: <CAP44EwknJPH3yzyLX-bHXccZJQHiO=bs5PBBAjaFmK4smGT0DQ@mail.gmail.com>

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox