public inbox for [email protected]  
help / color / mirror / Atom feed
Attempting to delete excess rows from table with BATCH DELETE
3+ messages / 3 participants
[nested] [flat]

* Attempting to delete excess rows from table with BATCH DELETE
@ 2026-01-28 03:22 Gus Spier <[email protected]>
  2026-01-28 03:39 ` Re: Attempting to delete excess rows from table with BATCH DELETE David G. Johnston <[email protected]>
  2026-01-28 03:57 ` Re: Attempting to delete excess rows from table with BATCH DELETE Ron Johnson <[email protected]>
  0 siblings, 2 replies; 3+ messages in thread

From: Gus Spier @ 2026-01-28 03:22 UTC (permalink / raw)
  To: pgsql-general <[email protected]>

Environment: AWS RDS Aurora for PostgreSQL 15 hosting tables that
support scientific research. The development environment predominantly
uses JPA with Hibernate.

Years of neglect have allowed mission tables to accumulate hundreds of
millions of rows of excess data. The developers and the customer
decided we must delete all rows older than 75 days. Table partitioning
was briefly considered but discarded because of the effort needed to
refactor the codebase.

I proposed the straight-forward course of action: delete by batches
from the victim tables.

The solution seemed obvious:

For candidate tables:
- Determine timestamp column iwith the most relevant value and call it
the Discriminator.
- Delete any row whose Discriminator value is 60 days older than the
Discriminator, with a LIMIT of 50000,
- Get the results of the batch by querying GET DIAGNOSTICS. That value
held aside for later use.
- COMMIT the delete transaction
- Loop to the top and continue deleting batches until there are no
more rows older than 60 days.

- Before running a test, I ensure that the Discriminator column is indexed

I write a procedure to accomplish all this work but it persists in
returning a error to the effect that a COMMIT is not valid in a block
tht tries to DELETE data.

Has anybody seen this before? Is there a solution?

Thanks in advance for any help you may be able to offer.

Regards,

Gus Spier






^ permalink  raw  reply  [nested|flat] 3+ messages in thread

* Re: Attempting to delete excess rows from table with BATCH DELETE
  2026-01-28 03:22 Attempting to delete excess rows from table with BATCH DELETE Gus Spier <[email protected]>
@ 2026-01-28 03:39 ` David G. Johnston <[email protected]>
  1 sibling, 0 replies; 3+ messages in thread

From: David G. Johnston @ 2026-01-28 03:39 UTC (permalink / raw)
  To: Gus Spier <[email protected]>; +Cc: pgsql-general <[email protected]>

On Tuesday, January 27, 2026, Gus Spier <[email protected]> wrote:

>
> I write a procedure to accomplish all this work but it persists in
> returning a error to the effect that a COMMIT is not valid in a block
> tht tries to DELETE data.
>

Haven’t tested to be sure but this doesn’t seem like a community edition
limitation. I don’t see any mention of this here at least:

https://www.postgresql.org/docs/current/xproc.html

You’d have to move the logic to a proper client application that executes
top-level commands.

David J.


^ permalink  raw  reply  [nested|flat] 3+ messages in thread

* Re: Attempting to delete excess rows from table with BATCH DELETE
  2026-01-28 03:22 Attempting to delete excess rows from table with BATCH DELETE Gus Spier <[email protected]>
@ 2026-01-28 03:57 ` Ron Johnson <[email protected]>
  1 sibling, 0 replies; 3+ messages in thread

From: Ron Johnson @ 2026-01-28 03:57 UTC (permalink / raw)
  To: pgsql-general

On Tue, Jan 27, 2026 at 10:22 PM Gus Spier <[email protected]> wrote:

> Environment: AWS RDS Aurora for PostgreSQL 15 hosting tables that
> support scientific research. The development environment predominantly
> uses JPA with Hibernate.
>
> Years of neglect have allowed mission tables to accumulate hundreds of
> millions of rows of excess data. The developers and the customer
> decided we must delete all rows older than 75 days. Table partitioning
> was briefly considered but discarded because of the effort needed to
> refactor the codebase.
>
> I proposed the straight-forward course of action: delete by batches
> from the victim tables.
>
> The solution seemed obvious:
>
> For candidate tables:
> - Determine timestamp column iwith the most relevant value and call it
> the Discriminator.
> - Delete any row whose Discriminator value is 60 days older than the
> Discriminator, with a LIMIT of 50000,
> - Get the results of the batch by querying GET DIAGNOSTICS. That value
> held aside for later use.
> - COMMIT the delete transaction
> - Loop to the top and continue deleting batches until there are no
> more rows older than 60 days.
>
> - Before running a test, I ensure that the Discriminator column is indexed
>
> I write a procedure to accomplish all this work but it persists in
> returning a error to the effect that a COMMIT is not valid in a block
> tht tries to DELETE data.
>
> Has anybody seen this before?


Yup, when putting the COMMIT in a DO block, which isn't allowed.


> Is there a solution?
>

Loop using bash.  In a similar case like this, I first looked for the
oldest day's data in the tables, then did something like this bash
pseudo-code:
StopDate=$(date -d'60 days ago')
DeleteDay=$1
export PGHOST=foo.example.com
export PGDATABASE=bar
while [[ "$DeleteDay <= "$StopDate" ]]; do
    psql -Xc "DELETE FROM blarge WHERE txn_date > '$DeleteDay' + INTERVAL
'1' DAY;"
    DeleteDay=$(date -d "$DeleteDay + 1 day" +"%Y-%m-%d")
done

Using that method, I developed a fast and automated monthly archive process
which exported and then deleted from 120 tables.

-- 
Death to <Redacted>, and butter sauce.
Don't boil me, I'm still alive.
<Redacted> lobster!


^ permalink  raw  reply  [nested|flat] 3+ messages in thread


end of thread, other threads:[~2026-01-28 03:57 UTC | newest]

Thread overview: 3+ messages (download: mbox mbox.gz follow: Atom feed)
-- links below jump to the message on this page --
2026-01-28 03:22 Attempting to delete excess rows from table with BATCH DELETE Gus Spier <[email protected]>
2026-01-28 03:39 ` David G. Johnston <[email protected]>
2026-01-28 03:57 ` Ron Johnson <[email protected]>

This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox