public inbox for [email protected]help / color / mirror / Atom feed
Attempting to delete excess rows from table with BATCH DELETE 3+ messages / 3 participants [nested] [flat]
* Attempting to delete excess rows from table with BATCH DELETE @ 2026-01-28 03:22 Gus Spier <[email protected]> 2026-01-28 03:39 ` Re: Attempting to delete excess rows from table with BATCH DELETE David G. Johnston <[email protected]> 2026-01-28 03:57 ` Re: Attempting to delete excess rows from table with BATCH DELETE Ron Johnson <[email protected]> 0 siblings, 2 replies; 3+ messages in thread From: Gus Spier @ 2026-01-28 03:22 UTC (permalink / raw) To: pgsql-general <[email protected]> Environment: AWS RDS Aurora for PostgreSQL 15 hosting tables that support scientific research. The development environment predominantly uses JPA with Hibernate. Years of neglect have allowed mission tables to accumulate hundreds of millions of rows of excess data. The developers and the customer decided we must delete all rows older than 75 days. Table partitioning was briefly considered but discarded because of the effort needed to refactor the codebase. I proposed the straight-forward course of action: delete by batches from the victim tables. The solution seemed obvious: For candidate tables: - Determine timestamp column iwith the most relevant value and call it the Discriminator. - Delete any row whose Discriminator value is 60 days older than the Discriminator, with a LIMIT of 50000, - Get the results of the batch by querying GET DIAGNOSTICS. That value held aside for later use. - COMMIT the delete transaction - Loop to the top and continue deleting batches until there are no more rows older than 60 days. - Before running a test, I ensure that the Discriminator column is indexed I write a procedure to accomplish all this work but it persists in returning a error to the effect that a COMMIT is not valid in a block tht tries to DELETE data. Has anybody seen this before? Is there a solution? Thanks in advance for any help you may be able to offer. Regards, Gus Spier ^ permalink raw reply [nested|flat] 3+ messages in thread
* Re: Attempting to delete excess rows from table with BATCH DELETE 2026-01-28 03:22 Attempting to delete excess rows from table with BATCH DELETE Gus Spier <[email protected]> @ 2026-01-28 03:39 ` David G. Johnston <[email protected]> 1 sibling, 0 replies; 3+ messages in thread From: David G. Johnston @ 2026-01-28 03:39 UTC (permalink / raw) To: Gus Spier <[email protected]>; +Cc: pgsql-general <[email protected]> On Tuesday, January 27, 2026, Gus Spier <[email protected]> wrote: > > I write a procedure to accomplish all this work but it persists in > returning a error to the effect that a COMMIT is not valid in a block > tht tries to DELETE data. > Haven’t tested to be sure but this doesn’t seem like a community edition limitation. I don’t see any mention of this here at least: https://www.postgresql.org/docs/current/xproc.html You’d have to move the logic to a proper client application that executes top-level commands. David J. ^ permalink raw reply [nested|flat] 3+ messages in thread
* Re: Attempting to delete excess rows from table with BATCH DELETE 2026-01-28 03:22 Attempting to delete excess rows from table with BATCH DELETE Gus Spier <[email protected]> @ 2026-01-28 03:57 ` Ron Johnson <[email protected]> 1 sibling, 0 replies; 3+ messages in thread From: Ron Johnson @ 2026-01-28 03:57 UTC (permalink / raw) To: pgsql-general On Tue, Jan 27, 2026 at 10:22 PM Gus Spier <[email protected]> wrote: > Environment: AWS RDS Aurora for PostgreSQL 15 hosting tables that > support scientific research. The development environment predominantly > uses JPA with Hibernate. > > Years of neglect have allowed mission tables to accumulate hundreds of > millions of rows of excess data. The developers and the customer > decided we must delete all rows older than 75 days. Table partitioning > was briefly considered but discarded because of the effort needed to > refactor the codebase. > > I proposed the straight-forward course of action: delete by batches > from the victim tables. > > The solution seemed obvious: > > For candidate tables: > - Determine timestamp column iwith the most relevant value and call it > the Discriminator. > - Delete any row whose Discriminator value is 60 days older than the > Discriminator, with a LIMIT of 50000, > - Get the results of the batch by querying GET DIAGNOSTICS. That value > held aside for later use. > - COMMIT the delete transaction > - Loop to the top and continue deleting batches until there are no > more rows older than 60 days. > > - Before running a test, I ensure that the Discriminator column is indexed > > I write a procedure to accomplish all this work but it persists in > returning a error to the effect that a COMMIT is not valid in a block > tht tries to DELETE data. > > Has anybody seen this before? Yup, when putting the COMMIT in a DO block, which isn't allowed. > Is there a solution? > Loop using bash. In a similar case like this, I first looked for the oldest day's data in the tables, then did something like this bash pseudo-code: StopDate=$(date -d'60 days ago') DeleteDay=$1 export PGHOST=foo.example.com export PGDATABASE=bar while [[ "$DeleteDay <= "$StopDate" ]]; do psql -Xc "DELETE FROM blarge WHERE txn_date > '$DeleteDay' + INTERVAL '1' DAY;" DeleteDay=$(date -d "$DeleteDay + 1 day" +"%Y-%m-%d") done Using that method, I developed a fast and automated monthly archive process which exported and then deleted from 120 tables. -- Death to <Redacted>, and butter sauce. Don't boil me, I'm still alive. <Redacted> lobster! ^ permalink raw reply [nested|flat] 3+ messages in thread
end of thread, other threads:[~2026-01-28 03:57 UTC | newest] Thread overview: 3+ messages (download: mbox mbox.gz follow: Atom feed) -- links below jump to the message on this page -- 2026-01-28 03:22 Attempting to delete excess rows from table with BATCH DELETE Gus Spier <[email protected]> 2026-01-28 03:39 ` David G. Johnston <[email protected]> 2026-01-28 03:57 ` Ron Johnson <[email protected]>
This inbox is served by agora; see mirroring instructions for how to clone and mirror all data and code used for this inbox