public inbox for [email protected]  
help / color / mirror / Atom feed
From: Adrian Klaver <[email protected]>
To: Dimitrios Apostolou <[email protected]>
To: Laurenz Albe <[email protected]>
Cc: [email protected]
Subject: Re: Experience and feedback on pg_restore --data-only
Date: Mon, 24 Mar 2025 08:31:49 -0700
Message-ID: <[email protected]> (raw)
In-Reply-To: <[email protected]>
References: <[email protected]>
	<[email protected]>
	<[email protected]>

On 3/24/25 07:24, Dimitrios Apostolou wrote:
> On Sun, 23 Mar 2025, Laurenz Albe wrote:
> 
>> On Thu, 2025-03-20 at 23:48 +0100, Dimitrios Apostolou wrote:
>>> Performance issues: (important as my db size is >5TB)
>>>
>>> * WAL writes: I didn't manage to avoid writing to the WAL, despite 
>>> having
>>>    setting wal_level=minimal. I even wrote my own function to ALTER all
>>>    tables to UNLOGGED, but failed with "could not change table T to
>>>    unlogged because it references logged table".  I'm out of ideas on 
>>> this
>>>    one.
>>
>> You'd have to create an load the table in the same transaction, that is,
>> you'd have to run pg_restore with --single-transaction.
> 
> That would restore the schema from the dump, while I want to create the
> schema from the SQL code in version control.


I am not following, from your original post:

"
  ... create a
clean database by running the SQL schema definition from version 
control, and then copy the data for only the tables created.

For this case, I choose to run pg_restore --data-only, and run it as the 
user who owns the database (dbowner), not as a superuser, in order to 
avoid changes being introduced under the radar.
"

You are running the process in two steps, where the first does not 
involve pg_restore. Not sure why doing the pg_restore --data-only 
portion in single transaction is not possible?

> 
> Something that might work, would be for pg_restore to issue a TRUNCATE
> before the COPY. I believe this would require superuser privelege though,
> that I would prefer to avoid. Currently I issue TRUNCATE for all tables
> manually before running pg_restore, but of course this is in a different
> transaction so it doesn't help.
> 
> By the way do you see potential problems with using --single-transaction
> to restore billion-rows tables?

COPY is all or none(version 17+ caveat(see 
https://www.postgresql.org/docs/current/sql-copy.html  ON_ERROR)), so if 
the data dump fails in --single-transaction everything rolls back.

> 
> 
> Thank you,
> Dimitris

-- 
Adrian Klaver
[email protected]







view thread (7+ messages)  latest in thread

reply

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Reply to all the recipients using the --to and --cc options:
  reply via email

  To: [email protected]
  Cc: [email protected], [email protected], [email protected], [email protected]
  Subject: Re: Experience and feedback on pg_restore --data-only
  In-Reply-To: <[email protected]>

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox