From: Murthy Nunna <[email protected]>
To: Pgsql-admin <[email protected]>
Subject: RE: Running rsync backups in pg15
Date: Thu, 7 Nov 2024 20:58:43 +0000
Message-ID: <DM8PR09MB6677705AE81436745B4487F6B85C2@DM8PR09MB6677.namprd09.prod.outlook.com> (raw)
In-Reply-To: <01000193082d1077-34d9461d-49e4-44ef-b83a-0201df4fc0ba-000000@email.amazonses.com>
References: <CANzqJaCr-id_4YotukPeH=rHCELS58Z63FoEbkNFBSqzjGMueQ@mail.gmail.com>
	<01000193082d1077-34d9461d-49e4-44ef-b83a-0201df4fc0ba-000000@email.amazonses.com>

rsync backups are not just incremental backups. They are incremental merged backups. You always end up with a full backup at the end of the run.

My database is 18TB. The very first backup took 14 hours. We just keep overwriting this full backup with daily rsyncs. The daily rsyncs take about 15 to 30 minutes, depending on the activity since the last backup.

I feel the new way (pg_backup_start) is a step backward. I did not really see any issue with the old way. When I crashed my cluster (pg14), it nicely removed the backup label, if one existed, at restart. I think that is good enough.



From: Doug Reynolds <[email protected]>
Sent: Thursday, November 7, 2024 1:50 PM
To: Pgsql-admin <[email protected]>
Subject: Re: Running rsync backups in pg15


I would be curious to see how long it would take to restore a 5TB database from each WAL ever created on the system.

We used to do a similar process with Oracle on a 30TB database years ago, but it literally took 26-28 hours to do a full backup.

Doug


On Nov 7, 2024, at 1:05 PM, Ron Johnson <[email protected]> wrote:

On Thu, Nov 7, 2024 at 12:47 PM Evan Rempel <[email protected]> wrote:
We use a similar approach, but instead of using rsync, we use our backup software directly, which is an incremental-forever tool. It allows backups of TB-scale DBs in a few minutes. Switching to pgbackrest is actually a step backwards for us.

Last night's pgbackrest incremental backup of a 5.1TB database took a whopping 92 seconds.  How's that a backwards step?

Sure, the weekly full backup takes 84 minutes, but that's in no way, shape, or form painfully slow.

But as the OP states, if you have to keep the postgresql session open between pg_start_backup and pg_stop_backup, then we will have to make a significant architectural change.

Does anyone know if there is a straightforward way to allow pg_start_backup and pg_stop_backup to be run in different sessions?


--
Evan

________________________________
From: Ron Johnson <[email protected]>
Sent: November 7, 2024 9:34 AM
To: [email protected] <[email protected]>
Subject: Re: Running rsync backups in pg15

On Thu, Nov 7, 2024 at 11:35 AM Murthy Nunna <[email protected]> wrote:

Hi,



In PG14 and earlier, there is no requirement to keep a database connection open while rsync is in progress. However, a change in PG15+ requires rsync to run while the same database session that executed SELECT pg_backup_start('label') is still open. This change requires a rewrite of our existing scripts.



Currently (pg14):



In bash script (run from cron)

1. psql Select pg_start_backup
2. rsync
3. psql Select pg_stop_backup



In pg15 and later:



In bash script (run from cron)



psql

Select pg_backup_start

\! run-rsync-script

Select pg_backup_stop



It can be done, but it makes it ugly to check errors and so forth that occur in the rsync script.
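
One way the PG15+ flow above might be made easier to error-check is to drive it from a small program that holds the session itself, rather than shelling out from inside psql. The following is only a sketch under several assumptions: it is written in Python rather than bash, the function name run_backup is hypothetical, the connection is assumed to be a psycopg2-style DB-API connection (hence the %s placeholder), and treating rsync exit code 24 (source files vanished during the copy) as success is a judgement call that may not suit every site.

```python
import subprocess


def run_backup(conn, rsync_cmd, label="nightly-rsync",
               ok_codes=(0, 24), run=subprocess.run):
    """Keep one session open across pg_backup_start, rsync, pg_backup_stop.

    conn      -- open DB-API connection (assumed psycopg2-style, '%s' params)
    rsync_cmd -- argv list for the copy step, e.g. ["rsync", "-a", src, dst]
    ok_codes  -- rsync exit codes treated as success (24 = vanished files;
                 tolerating it is an assumption, adjust as needed)
    run       -- injectable runner, defaults to subprocess.run

    Returns (labelfile, spcmapfile) as reported by pg_backup_stop().
    """
    cur = conn.cursor()

    # PG15+ function name; the second argument asks for a fast checkpoint.
    cur.execute("SELECT pg_backup_start(%s, true)", (label,))

    # The copy runs while the session that started the backup stays open.
    result = run(rsync_cmd)
    if result.returncode not in ok_codes:
        # Dropping the session abandons the backup instead of finishing it.
        conn.close()
        raise RuntimeError(f"rsync failed with exit code {result.returncode}")

    # Same session finishes the backup; true = wait for WAL archiving.
    cur.execute("SELECT labelfile, spcmapfile FROM pg_backup_stop(true)")
    labelfile, spcmapfile = cur.fetchone()
    return labelfile, spcmapfile
```

With this shape, the rsync exit status is an ordinary return code in the calling program, so a cron wrapper can log it or alert on it directly. The caller would still need to save the returned labelfile text as a file named backup_label in the root of the copied data directory (and spcmapfile as tablespace_map when it is non-empty), since PG15+ no longer writes that file into the live data directory.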



Anybody found an elegant way of doing this?

Run pgbackrest instead of rsync.

--
Death to <Redacted>, and butter sauce.
Don't boil me, I'm still alive.
<Redacted> lobster!



