From: Adrian Klaver <[email protected]>
To: Álvaro Herrera <[email protected]>
To: Dimitrios Apostolou <[email protected]>
Cc: [email protected]
Subject: Re: In-order pg_dump (or in-order COPY TO)
Date: Thu, 4 Sep 2025 07:49:43 -0700
Message-ID: <[email protected]> (raw)
In-Reply-To: <[email protected]>
References: <[email protected]>

On 9/4/25 05:02, Álvaro Herrera wrote:
> On 2025-Aug-26, Dimitrios Apostolou wrote:
> 
>> I am storing dumps of a database (pg_dump custom format) in a de-duplicating
>> backup server. Each dump is many terabytes in size, so deduplication is very
>> important. And de-duplication itself is based on rolling checksums which is
>> pretty flexible, it can compensate for blocks moving by some offset.
> 
> Hello,
> 
> It's generally considered nowadays that pg_dump is not the best option
> to create backups of very large databases.  You may be better served by
> using a binary backup tool -- something like Barman.  With current
> Postgres releases you can create incremental backups, which would
> probably be more effective at deduplicating than playing with pg_dump's
> TOC, because it's based on what actually happens to the data.  Barman

As I understand it, the TOC issue was with pg_restore: it had to
generate the offsets because they were not included in the backup file,
since the dump was streamed rather than written to a file. The
deduplication became an issue when changing Postgres versions, per:

https://www.postgresql.org/message-id/4ss66r31-558o-qq24-332q-no351p7n5osr%40tzk.arg

"
 > The problem occurs when you do the pg_dump after this restore, correct?


Correct. The first pg_dump from the restored pg17 is not deduplicated at
all. Most of the tables have not changed (logically at least; apparently
they have changed physically).
"
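
To illustrate the quoted point, here is a minimal Python sketch of
content-defined chunking. It is not Borg's actual chunker: the window
size, the boundary mask, and the random 40-byte "rows" are assumptions
made up for illustration, and the window checksum is recomputed naively
at each position rather than rolled. It only aims to show why data that
moves by an offset still deduplicates, while the same rows written in a
different physical order mostly do not.

```python
import hashlib
import random
import zlib

WINDOW = 16   # bytes of context that decide a chunk boundary (assumed value)
MASK = 0x3F   # boundary when the low 6 bits are zero, ~64-byte average chunks


def chunk_digests(data: bytes) -> set:
    """Split data at content-defined boundaries; return the set of chunk digests."""
    digests = set()
    start = 0
    for end in range(WINDOW, len(data) + 1):
        # Checksum of the last WINDOW bytes, recomputed naively here.
        # A real chunker would update it incrementally (a rolling checksum).
        if zlib.crc32(data[end - WINDOW:end]) & MASK == 0:
            digests.add(hashlib.sha256(data[start:end]).digest())
            start = end
    if start < len(data):
        digests.add(hashlib.sha256(data[start:]).digest())
    return digests


def shared_fraction(old: bytes, new: bytes) -> float:
    """Fraction of the new stream's chunks that already exist in the old one."""
    old_chunks, new_chunks = chunk_digests(old), chunk_digests(new)
    return len(new_chunks & old_chunks) / len(new_chunks)


rng = random.Random(0)
# Hypothetical table rows: 2000 rows of 40 random bytes each.
rows = [bytes(rng.getrandbits(8) for _ in range(40)) for _ in range(2000)]

original = b"".join(rows)
shifted = b"HEADER!" + original        # same data, moved by 7 bytes
shuffled_rows = rows[:]
rng.shuffle(shuffled_rows)
reordered = b"".join(shuffled_rows)    # same rows, different physical order

shifted_share = shared_fraction(original, shifted)
reordered_share = shared_fraction(original, reordered)
print(f"shifted:   {shifted_share:.0%} of chunks already stored")
print(f"reordered: {reordered_share:.0%} of chunks already stored")
```

Because boundaries depend only on nearby content, the shifted stream
re-synchronizes after its first chunk. In the reordered stream most
chunks span row boundaries, so their contents differ even though every
row is logically unchanged.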


> provides support for hook scripts, which perhaps can be used to transfer
> the backup files to Borg.  (I haven't actually tried to do this, but the
> Barman developers talk about using them to transfer the backups to tape,
> so I imagine getting them to play with Borg is a Simple Matter of
> Programming.)
> 


-- 
Adrian Klaver
[email protected]





