Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1twkE5-00HDWG-3U for pgsql-general@arkaria.postgresql.org; Mon, 24 Mar 2025 16:00:33 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.94.2) (envelope-from ) id 1twkE3-005g5i-HL for pgsql-general@arkaria.postgresql.org; Mon, 24 Mar 2025 16:00:31 +0000 Received: from magus.postgresql.org ([2a02:c0:301:0:ffff::29]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1twkE3-005g5a-4Y for pgsql-general@lists.postgresql.org; Mon, 24 Mar 2025 16:00:31 +0000 Received: from mail-oo1-xc36.google.com ([2607:f8b0:4864:20::c36]) by magus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.96) (envelope-from ) id 1twkE0-000tyJ-1b for pgsql-general@postgresql.org; Mon, 24 Mar 2025 16:00:30 +0000 Received: by mail-oo1-xc36.google.com with SMTP id 006d021491bc7-602513d21fdso283245eaf.0 for ; Mon, 24 Mar 2025 09:00:28 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1742832026; x=1743436826; darn=postgresql.org; h=to:subject:message-id:date:from:in-reply-to:references:mime-version :from:to:cc:subject:date:message-id:reply-to; bh=Np3gWBW+UYs5XHL0K1ZuqMdWZDtZbW0R1iVX2hAyJhQ=; b=iB3EicPswvoXPjEJ2mP+lPRL0fvjDxDflJMCMdo9tvpTcI4cCKAqdy97SgB8+oAs/i bmNWMz8LMB/PZb7KtCkkcfVx1p9gh8RfIdZQL4DKKWtLdZS2JefB45J2C3CE0BsnkDzy zL2agRR1gqbZeFyJOLmPsF3FQEMYn1K6ZdFfB5NTw6uWcOr5maZkfbiIZ7xI0ZEIWH/1 fyF6pysqv7DgQhJDxALUqsTv9OuabcQ38y6E2xa4TS4J/MOMhMJOXiO0f6xa29RIj4D3 c1LJYxwElhpf7tgBAdQ/4yhlIqZyFoaSE3jVjkvM6xvO/9Yt8kxJteNYVTsCBjSSwQE+ Af/w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1742832026; x=1743436826; h=to:subject:message-id:date:from:in-reply-to:references:mime-version :x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=Np3gWBW+UYs5XHL0K1ZuqMdWZDtZbW0R1iVX2hAyJhQ=; b=cpS51TX8q/fQtYEJ6PBNwO2vsnM0lOJdGEryyvQshUnPccbLLBEXi9k/pBmtrxrMFE tx3j80IZmDCJjzp1QQL7SeUzdsxuQe2r8CY0hhpi3nCFwtpzQigc+Dfs4EeoICjmhJpA zvUtM1rYwH2x8zHka+fx5ENluxpMsl6c5CpI8TzySPHkhJF1+63tGWF07tvP8kQjmZ2x GFM+y4CJ3gF2dcT4U8XFbpu8trWQYn/0gpEcW2lBGbrj5W6morSXDjNhpB2aRDeLmYjr RrMMZQh/4nuPgBznWGJqI8IFtYkTDZbZgr9psOMjAix1RCa8du7KEYzBd4LVbBoBNUQz N8Ew== X-Gm-Message-State: AOJu0YyRJRnBa3WxuWlONtByUJ01bZZr/j/fawkBXS70j1YzkhgExJKO uXxmNy7hAOHlMJSK4nT8J7NRqG+2Lwhw94r6xH6MrSSbJIO+qFsLrN0vnO/jVhzgjf9I9B/0k8h abrLC6Mz51YMPrnXjFWxaMeHYU/bHiA== X-Gm-Gg: ASbGncusbzRGOEnaCtkdOFTeiQGfGk5aXjK+rXv0HS7+L6oEW39w0YN+iMelsu4FrBx zuFU6ZgwDDe1Mu2J5IPqZq6ZBtj+UzvHERRE39xjdXkD/oPrLIW7Ufx9XlH3lQ7ig2ADhawO1su 8wotZWdz40KU4iP/woOl3GRQviGxzjHW5vyBEhr6oyF/YjDEh4beJ9gjx2gBeyJ1GWLD89QQ== X-Google-Smtp-Source: AGHT+IG5RvtXpD6rY/Gr/AqisaeKtxzcym0SqzQL9W3H6Dv2+71m9B3SRXkKYAUc6/RaVXvqqp4oaJPegL4K5pmcpbM= X-Received: by 2002:a05:6820:a10a:b0:600:239b:c1cd with SMTP id 006d021491bc7-602345ff212mr6637924eaf.6.1742832025956; Mon, 24 Mar 2025 09:00:25 -0700 (PDT) MIME-Version: 1.0 References: <53760c70-4a87-a453-9e02-57abc9cb2e54@gmx.net> <455d28421ae33c73b73a6f527d2f72816ca5dd29.camel@cybertec.at> <5f1ebeda-f080-cb31-75c0-ce2211ea348f@gmx.net> <832c1cdd-c0fe-464b-b4b9-f9d0482b9b78@aklaver.com> <7e990eae-e55c-0d04-1be8-f49bb3251073@gmx.net> In-Reply-To: <7e990eae-e55c-0d04-1be8-f49bb3251073@gmx.net> From: Ron Johnson Date: Mon, 24 Mar 2025 12:00:15 -0400 X-Gm-Features: AQ5f1JpDVgdSJDaj2amvQeap2k52peqQjJn5DnTLuDphnpNqhMXe6dNkLkbW4Xs Message-ID: Subject: Re: Experience and feedback on pg_restore --data-only To: pgsql-general Content-Type: multipart/alternative; boundary="000000000000cf77f4063118b36c" List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk --000000000000cf77f4063118b36c Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable Why are you regularly having emergencies requiring the restoration of multi-TB tables to databases with lots of cruft? Fixing that would go a long way towards eliminating your problems with pg_restore. On Mon, Mar 24, 2025 at 11:51=E2=80=AFAM Dimitrios Apostolou wrote: > On Mon, 24 Mar 2025, Adrian Klaver wrote: > > > On 3/24/25 07:24, Dimitrios Apostolou wrote: > >> On Sun, 23 Mar 2025, Laurenz Albe wrote: > >> > >>> On Thu, 2025-03-20 at 23:48 +0100, Dimitrios Apostolou wrote: > >>>> Performance issues: (important as my db size is >5TB) > >>>> > >>>> * WAL writes: I didn't manage to avoid writing to the WAL, despite > >>>> having > >>>> setting wal_level=3Dminimal. I even wrote my own function to ALT= ER > all > >>>> tables to UNLOGGED, but failed with "could not change table T to > >>>> unlogged because it references logged table". I'm out of ideas = on > >>>> this > >>>> one. > >>> > >>> You'd have to create an load the table in the same transaction, that > is, > >>> you'd have to run pg_restore with --single-transaction. > >> > >> That would restore the schema from the dump, while I want to create t= he > >> schema from the SQL code in version control. > > > > > > I am not following, from your original post: > > > > " > > ... create a > > clean database by running the SQL schema definition from version > control, and > > then copy the data for only the tables created. > > > > For this case, I choose to run pg_restore --data-only, and run it as th= e > user > > who owns the database (dbowner), not as a superuser, in order to avoid > > changes being introduced under the radar. > > " > > > > You are running the process in two steps, where the first does not > involve > > pg_restore. Not sure why doing the pg_restore --data-only portion in > single > > transaction is not possible? > > Laurenz informed me that I could avoid writing to the WAL if I "create an= d > load the table in a single transaction". > I haven't tried, but here is what I would do to try --single-transaction: > > Transaction 1: manually issuing all of CREATE TABLE etc. > > Transaction 2: pg_restore --single-transaction --data-only > > The COPY command in transaction 2 would still need to write to WAL, since > it's separate from the CREATE TABLE. > > Am I wrong somewhere? > > >> Something that might work, would be for pg_restore to issue a TRUNCAT= E > >> before the COPY. I believe this would require superuser privelege > though, > >> that I would prefer to avoid. Currently I issue TRUNCATE for all tabl= es > >> manually before running pg_restore, but of course this is in a > different > >> transaction so it doesn't help. > >> > >> By the way do you see potential problems with using > --single-transaction > >> to restore billion-rows tables? > > > > COPY is all or none(version 17+ caveat(see > > https://www.postgresql.org/docs/current/sql-copy.html ON_ERROR)), so > if the > > data dump fails in --single-transaction everything rolls back. > > So if I restore all tables, then an error about a "table not found" would > not roll back already copied tables, since it's not part of a COPY? > > > Thank you for the feedback, > Dimitris > > --=20 Death to , and butter sauce. Don't boil me, I'm still alive. lobster! --000000000000cf77f4063118b36c Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable

Why are you regularly havi= ng emergencies requiring the restoration of multi-TB tables to databases wi= th lots of cruft?

Fixing that would go a=C2=A0long= way towards eliminating your problems with pg_restore.

On Mon, Mar 24, 2025 at 11:51=E2=80=AFAM Dimitrios Apostolou <jimis@gmx.net> wrote:
On Mon, 24 Mar 2025, Adrian Klaver w= rote:

> On 3/24/25 07:24, Dimitrios Apostolou wrote:
>>=C2=A0 On Sun, 23 Mar 2025, Laurenz Albe wrote:
>>
>>>=C2=A0 On Thu, 2025-03-20 at 23:48 +0100, Dimitrios Apostolou w= rote:
>>>>=C2=A0 Performance issues: (important as my db size is >= 5TB)
>>>>
>>>>=C2=A0 * WAL writes: I didn't manage to avoid writing t= o the WAL, despite
>>>>=C2=A0 having
>>>>=C2=A0 =C2=A0=C2=A0 setting wal_level=3Dminimal. I even wro= te my own function to ALTER all
>>>>=C2=A0 =C2=A0=C2=A0 tables to UNLOGGED, but failed with &qu= ot;could not change table T to
>>>>=C2=A0 =C2=A0=C2=A0 unlogged because it references logged t= able".=C2=A0 I'm out of ideas on
>>>>=C2=A0 this
>>>>=C2=A0 =C2=A0=C2=A0 one.
>>>
>>>=C2=A0 You'd have to create an load the table in the same t= ransaction, that is,
>>>=C2=A0 you'd have to run pg_restore with --single-transacti= on.
>>
>>=C2=A0 That would restore the schema from the dump, while I want to= create the
>>=C2=A0 schema from the SQL code in version control.
>
>
> I am not following, from your original post:
>
> "
> ... create a
> clean database by running the SQL schema definition from version contr= ol, and
> then copy the data for only the tables created.
>
> For this case, I choose to run pg_restore --data-only, and run it as t= he user
> who owns the database (dbowner), not as a superuser, in order to avoid=
> changes being introduced under the radar.
> "
>
> You are running the process in two steps, where the first does not inv= olve
> pg_restore. Not sure why doing the pg_restore --data-only portion in s= ingle
> transaction is not possible?

Laurenz informed me that I could avoid writing to the WAL if I "create= and
load the table in a single transaction".
I haven't tried, but here is what I would do to try --single-transactio= n:

Transaction 1: manually issuing all of CREATE TABLE etc.

Transaction 2: pg_restore --single-transaction --data-only

The COPY command in transaction 2 would still need to write to WAL, since it's separate from the CREATE TABLE.

Am I wrong somewhere?

>>=C2=A0 Something that might work, would be for pg_restore to issue = a TRUNCATE
>>=C2=A0 before the COPY. I believe this would require superuser priv= elege though,
>>=C2=A0 that I would prefer to avoid. Currently I issue TRUNCATE for= all tables
>>=C2=A0 manually before running pg_restore, but of course this is in= a different
>>=C2=A0 transaction so it doesn't help.
>>
>>=C2=A0 By the way do you see potential problems with using --single= -transaction
>>=C2=A0 to restore billion-rows tables?
>
> COPY is all or none(version 17+ caveat(see
> https://www.postgresql.org/docs/current/s= ql-copy.html=C2=A0 ON_ERROR)), so if the
> data dump fails in --single-transaction everything rolls back.

So if I restore all tables, then an error about a "table not found&quo= t; would
not roll back already copied tables, since it's not part of a COPY?


Thank you for the feedback,
Dimitris



--
Death to <Redacted>, and butter sauce.Don't boil me, I'm still alive.
<Redacted> lobs= ter!
--000000000000cf77f4063118b36c--