public inbox for [email protected]  
From: Moreno Andreo <[email protected]>
To: PostgreSQL mailing lists <[email protected]>
Subject: Logical replication, need to reclaim big disk space
Date: Fri, 16 May 2025 17:45:59 +0200
Message-ID: <[email protected]>

Hi,
     we are changing how we store binary data, moving it from bytea 
fields in a table to external storage (making the database smaller and 
related operations faster and smarter).
In short, we have a job that runs in the background, copies data from 
the table to an external file, and then sets the bytea field to NULL.
(UPDATE tbl SET blob = NULL, ref = 'path/to/file' WHERE id = <uuid>)
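
A minimal sketch of such a job, under stated assumptions: sqlite3 is used 
only as a stand-in so that the example is self-contained (a PostgreSQL 
driver would use its own connection and placeholder style), the table and 
column names follow the statement above, and offload_row, offload_all and 
the out_dir file layout are hypothetical. Each row is cleared in its own 
transaction, so a failure stays with the row that caused it.

```python
import sqlite3
from pathlib import Path


def offload_row(conn, row_id, out_dir):
    """Copy one blob to a file, then clear the column in its own transaction."""
    row = conn.execute(
        "SELECT blob FROM tbl WHERE id = ? AND blob IS NOT NULL", (row_id,)
    ).fetchone()
    if row is None:
        return None  # unknown id, or the row was already offloaded
    path = Path(out_dir) / f"{row_id}.bin"
    path.write_bytes(row[0])
    # "with conn" commits on success and rolls back if the UPDATE raises.
    with conn:
        conn.execute(
            "UPDATE tbl SET blob = NULL, ref = ? WHERE id = ?",
            (str(path), row_id),
        )
    return path


def offload_all(conn, out_dir):
    """Process rows one at a time; record failures instead of stopping."""
    ids = [r[0] for r in conn.execute("SELECT id FROM tbl WHERE blob IS NOT NULL")]
    failed = []
    for row_id in ids:
        try:
            offload_row(conn, row_id, out_dir)
        except Exception as exc:  # keep the error tied to the row that caused it
            failed.append((row_id, str(exc)))
    return failed
```

offload_all returns a list of (id, error message) pairs for the rows that 
could not be processed, so a later run can retry only those.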

At the end of the operation, this results in a table that is less than 
one tenth of its original size.
We have a multi-tenant architecture (100s of schemas with identical 
architecture, all inheriting from public) and we are performing the task 
on one table per schema.

The problem is that this generates BIG table bloat, as you may imagine.
Running a VACUUM FULL on a formerly 22 GB table on a standalone test 
server is almost immediate.
If I had only one server, I would process one table at a time with a 
nightly script and issue a VACUUM FULL on the tables that have already 
been processed.
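
For the single-server case only, the nightly step could be sketched as 
below. This is an illustration, not a tested procedure: the helper names 
quote_ident and vacuum_statements are hypothetical, the table name tbl is 
taken from the UPDATE statement above, and the list of already-processed 
schemas is assumed to come from the job's own bookkeeping.

```python
def quote_ident(name):
    """Quote an SQL identifier by doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def vacuum_statements(processed_schemas, table="tbl"):
    """Return one VACUUM FULL statement per already-processed schema."""
    return [
        f"VACUUM FULL {quote_ident(schema)}.{quote_ident(table)};"
        for schema in processed_schemas
    ]
```

The statements would then be run one by one outside a transaction block, 
since VACUUM cannot run inside one; each VACUUM FULL holds an ACCESS 
EXCLUSIVE lock on its table while it rewrites it.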

But I'm in a logical replication architecture (we are using a 
multimaster system called pgEdge, but I don't think it makes a big 
difference, since it's based on logical replication), and I'm building a 
test cluster.

I've been instructed to issue VACUUM FULL on both nodes, nightly, but 
before proceeding I read in the docs that VACUUM FULL can disrupt logical 
replication, so I'm a bit concerned about how to proceed. Rows are cleared 
one at a time (one transaction per row, to confine errors to the record 
that caused them).

I read about extensions like pg_squeeze, but I wonder whether they are 
also dangerous for replication.

Thanks for your help.
Moreno.-