MIME-Version: 1.0
References: 
 <CAOC+FBVyTiuOMF92pQG3QOrE9j8qFegvZLoi-3UmdhFNg94e2A@mail.gmail.com>
 <14770c231bf27ea6d22376395ac8f02e41462ed5.camel@cybertec.at>
In-Reply-To: <14770c231bf27ea6d22376395ac8f02e41462ed5.camel@cybertec.at>
From: Wells Oliver <wells.oliver@gmail.com>
Date: Sun, 17 Nov 2024 09:12:05 -0800
Message-ID: 
 <CAOC+FBX1L6hx4DHzH34eaWyhH6NK9ywYSJMMcEJH3rHZfW86+A@mail.gmail.com>
Subject: Re: RDS restore failed due to WAL log and disk space-- any tidy
 fixes?
To: Laurenz Albe <laurenz.albe@cybertec.at>
Cc: pgsql-admin <pgsql-admin@postgresql.org>
Content-Type: multipart/alternative; boundary="0000000000005e173d06271ee839"
Archived-At: 
 <https://www.postgresql.org/message-id/CAOC%2BFBX1L6hx4DHzH34eaWyhH6NK9ywYSJMMcEJH3rHZfW86%2BA%40mail.gmail.com>
Precedence: bulk

--0000000000005e173d06271ee839
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable

Interesting. I am migrating a pg_dump archive to a new server in a single
go. Does it make sense to disable (or speed up?) WAL archiving during the
restore, then re-enable it afterwards so that a future replica can work?
What would the steps be here? Would disabling or "speeding up" be faster?

max_slot_wal_keep_size is -1 (no limit) at the moment, so I think that's
why it kept a ton of WAL and ran out of space.
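
To check that hunch, here is a minimal sketch of the arithmetic for how much
WAL a replication slot is pinning. It assumes slot names and restart_lsn
values as read from pg_replication_slots, plus a current LSN from
pg_current_wal_lsn(). The helper names and the sample values are made up
for illustration.

```python
# Sketch: estimate how much WAL each replication slot is holding back.
# An LSN is printed as two hex numbers "hi/lo"; its byte position is
# (hi << 32) + lo. Slot names and LSN values below are invented samples.
from typing import Dict, Optional


def lsn_to_bytes(lsn: str) -> int:
    """Convert a textual LSN such as '16/B374D848' to a byte position."""
    hi, lo = lsn.split("/")
    return (int(hi, 16) << 32) + int(lo, 16)


def retained_wal_bytes(current_lsn: str,
                       slots: Dict[str, Optional[str]]) -> Dict[str, int]:
    """Map slot name -> bytes of WAL between its restart_lsn and current_lsn.

    Slots without a restart_lsn (never used) retain nothing and are skipped.
    """
    now = lsn_to_bytes(current_lsn)
    return {
        name: now - lsn_to_bytes(restart)
        for name, restart in slots.items()
        if restart is not None
    }


if __name__ == "__main__":
    sample_slots = {"replica_a": "1/00000000", "unused_slot": None}
    for name, size in retained_wal_bytes("3/00000000", sample_slots).items():
        print(f"{name}: {size / 1024**3:.1f} GiB retained")
```

If no slot shows a large gap, the retained WAL is more likely waiting on the
archiver than on max_slot_wal_keep_size.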

On Sun, Nov 17, 2024 at 7:41 AM Laurenz Albe <laurenz.albe@cybertec.at>
wrote:

> On Sat, 2024-11-16 at 16:33 -0800, Wells Oliver wrote:
> > I provisioned an RDS instance with 2500GB space and began the restore of
> > a database I know to be about 1750 GB using 16 jobs.
> >
> > Unfortunately, it died very near the end when it ran out of disk space
> > due to WAL log usage. Lots of:
> >
> > 2024-11-17 00:07:09 UTC::@:[19861]:PANIC:  could not write to file
> > "pg_wal/xlogtemp.19861": No space left on device
> >
> >
> > And then kaboom.
> >
> > I'm wondering what my course of action should be. Can I disable/reduce
> > WAL during a restore? wal_level is set to replica, can this temporarily
> > be set to minimal? Should I just eat the extra costs to add headroom for
> > the WAL? Would using fewer jobs during a restore reduce the amount of
> > WAL created?
>
> If you are using minimal WAL logging and you restore the dump in a single
> transaction, you should see way less WAL generated, because data inserted
> into the table in the same transaction as the CREATE TABLE statement need
> not be WAL logged.
>
> But you might more easily solve the problem by speeding up or disabling
> the WAL archiver, so that PostgreSQL removes old WAL after the next
> checkpoint.
>
> Yours,
> Laurenz Albe
>
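
For reference, a rough sketch of how the single-transaction restore
suggested above might be invoked. pg_restore does not allow
--single-transaction together with --jobs, so this would mean giving up the
16 parallel jobs. The host, database name, and dump path are placeholders,
not my real values, and build_pg_restore_cmd is a hypothetical helper.

```python
# Sketch: build a pg_restore command line for a single-transaction restore.
# Host, database name and dump path are placeholders for illustration.
import shlex
from typing import List


def build_pg_restore_cmd(dump_path: str, dbname: str, host: str,
                         single_transaction: bool = True,
                         jobs: int = 1) -> List[str]:
    """Return the argv list for pg_restore.

    pg_restore rejects --single-transaction combined with --jobs,
    so that combination is refused here before anything is run.
    """
    if single_transaction and jobs > 1:
        raise ValueError("--single-transaction cannot be combined with --jobs")
    cmd = ["pg_restore", "--host", host, "--dbname", dbname]
    if single_transaction:
        cmd.append("--single-transaction")
    if jobs > 1:
        cmd += ["--jobs", str(jobs)]
    cmd.append(dump_path)
    return cmd


if __name__ == "__main__":
    argv = build_pg_restore_cmd("/path/to/archive.dump", "mydb",
                                "example.rds.amazonaws.com")
    # Print only; pass argv to subprocess.run to actually execute it.
    print(shlex.join(argv))
```

The WAL saving only applies if wal_level is actually minimal during the
restore; whether RDS lets me set that is something I still need to check.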


-- 
Wells Oliver
wells.oliver@gmail.com <wellsoliver@gmail.com>

--0000000000005e173d06271ee839--