Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1tCj7x-00Be99-PB for pgsql-admin@arkaria.postgresql.org; Sun, 17 Nov 2024 17:32:01 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.94.2) (envelope-from ) id 1tCj7v-000pAX-6D for pgsql-admin@arkaria.postgresql.org; Sun, 17 Nov 2024 17:31:59 +0000 Received: from makus.postgresql.org ([2001:4800:3e1:1::229]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1tCj7u-000pAM-NF for pgsql-admin@lists.postgresql.org; Sun, 17 Nov 2024 17:31:59 +0000 Received: from mail-ej1-x62c.google.com ([2a00:1450:4864:20::62c]) by makus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.94.2) (envelope-from ) id 1tCj7s-002Nt8-Kg for pgsql-admin@postgresql.org; Sun, 17 Nov 2024 17:31:58 +0000 Received: by mail-ej1-x62c.google.com with SMTP id a640c23a62f3a-a9ed7d8c86cso478233866b.2 for ; Sun, 17 Nov 2024 09:31:56 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1731864715; x=1732469515; darn=postgresql.org; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:from:to:cc:subject:date:message-id:reply-to; bh=njhrD0XWfHMlmdmCRtMKi3/bw4sIfpYjbnHxanEqUPk=; b=hk0+Z6pyRd0RvmB6Q9alJ9Q+BZlnaR34d09sSdAESSTbNtKKdZBqmncwm+UbDco5tP qqnBrxQ5UYm8EZ1Q+8eEr5pfDcuLTHQknrXB7A3bSLXb1br7aKd3kyrl+jpExuk0ePub 399Xw3ow9bSSD/HR1h+3GALIUcz+Z1e6kd24KTfR5C7ckV1JmaO5L222LrnDptPPhlNw JonOWBREdkVi0ubACE/zK+anH+wwZob9m9p8ip4waQMlj81YVGaLAax+K10QJCnuOmKo sSwk0syRjskx0exjKJhl4H51CNFINU/RCTLRghoxyyBYLFZBN9dw4UYf6R2+iUonXres WFpw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1731864715; x=1732469515; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=njhrD0XWfHMlmdmCRtMKi3/bw4sIfpYjbnHxanEqUPk=; b=raes66qtLuUN3wCBwDP7h45aIObbFxHQjrbI+FJ3A4cGpAMKgfe5H/XqoVD7Zoayws IwBLOyIq+qz36XSi91pScRTSZBt5ycDWglJV1gJO7X6bmMb9Z8rDiMBqkYOZ4CC41Qt0 iOwoCWDnZ4oVHaGpAoYcwq9cGeMWhIe+iGL/XorHfYqONPT+BfklGL9DUjyhl+w344K0 LG3Uu+7lSpivEkuiZuKFZ1iU31CTYED0sKonUsXEb1Eok/swMJULU0GCTRef/xeb/Vl6 6RbE4L1buCZlTxQcOjU/9xIXiBO0/zRq4Y/vhel5yuvM8Hq6Qgu8JVJgeN/gScmCHY1D iNIQ== X-Gm-Message-State: AOJu0YyU+60OuJGO/wH3uk9dCyd1RGr5UgRXh5hKQNOnA5Fc0kcQLD6s DgZ7jTaPg0IRpHKAaVSYH4dQPUvYSaTf0w8LyPpSWtsIcx4vV+aDJgfLl4ktSDwYv/fXFTI+3M6 6RmwKxFvhKzQ71RwPv2N2bX5Jjf0K8yPC X-Google-Smtp-Source: AGHT+IHuhv1T/8yaBrVtU7coAD5d/f0X/KukRV5FkzJkb4eF/gfWn4loIbUofFQFYljhLpQ7TOLFnZVS6QEdfdWMgB8= X-Received: by 2002:a17:906:dac7:b0:aa4:777d:739a with SMTP id a640c23a62f3a-aa4833e9fcbmr928818066b.8.1731864715084; Sun, 17 Nov 2024 09:31:55 -0800 (PST) MIME-Version: 1.0 References: <14770c231bf27ea6d22376395ac8f02e41462ed5.camel@cybertec.at> In-Reply-To: From: Wells Oliver Date: Sun, 17 Nov 2024 09:31:19 -0800 Message-ID: Subject: Re: RDS restore failed due to WAL log and disk space-- any tidy fixes? To: Ron Johnson Cc: pgsql-admin Content-Type: multipart/alternative; boundary="0000000000002448f906271f2d3e" List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk --0000000000002448f906271f2d3e Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable Actually, in RDS it seems you cannot set archive_mode either. On Sun, Nov 17, 2024 at 9:23=E2=80=AFAM Wells Oliver wrote: > It does. I think it uses WAL behind the scenes. In RDS unfortunately > cannot set wal_level, but you can set archive_mode. > > On Sun, Nov 17, 2024 at 9:21=E2=80=AFAM Ron Johnson > wrote: > >> Doesn't RDS have its own replication? >> >> Anyway, for pg_restore, I'd absolutely set archive_mode=3Doff >> and wal_level=3Dminimal, then set them to their production values when i= t's >> finished. >> >> On Sun, Nov 17, 2024 at 12:12=E2=80=AFPM Wells Oliver >> wrote: >> >>> Interesting. I am migrating a pg_dump archive to a new server, in a >>> single go. Does it make sense to disable (or speed up?) WAL archiving >>> during the restore, then reenable it after the restore so a future repl= ica >>> could work? What would be the steps here? Would disabling or "speeding = up" >>> be faster? >>> >>> max_slot_wal_keep_size is -1 at the moment so I think that's why it kep= t >>> a ton of WAL and ran out of space. >>> >>> On Sun, Nov 17, 2024 at 7:41=E2=80=AFAM Laurenz Albe >>> wrote: >>> >>>> On Sat, 2024-11-16 at 16:33 -0800, Wells Oliver wrote: >>>> > I provisioned an RDS instance with 2500GB space and began the restor= e >>>> of a database I know to be about 1750 GB using 16 jobs. >>>> > >>>> > Unfortunately, it died very near the end when it ran out of disk >>>> space due to WAL log usage. Lots of: >>>> > >>>> > 2024-11-17 00:07:09 UTC::@:[19861]:PANIC: could not write to file >>>> "pg_wal/xlogtemp.19861": No space left on device >>>> > >>>> > >>>> > And then kaboom. >>>> > >>>> > I'm wondering what my course of action should be. Can I >>>> disable/reduce WAL during a restore? >>>> > wal_level is set to replica, can this temporarily be set to minimal? >>>> Should I just eat the extra >>>> > costs to add headroom for the WAL? Would using fewer jobs during a >>>> restore reduce the amount of WAL >>>> > created? >>>> >>>> If you are using minimal WAL logging and you restore the dump in a >>>> single transaction, you >>>> should see way less WAL generated, because data inserted into the tabl= e >>>> in the same transaction >>>> as the CREATE TABLE statement need not be WAL logged. >>>> >>>> But you might more easily solve the problem by speeding up or disablin= g >>>> the WAL archiver, >>>> so that PostgreSQL removes old WAL after the next checkpoint. >>>> >>>> Yours, >>>> Laurenz Albe >>>> >>> >>> >>> -- >>> Wells Oliver >>> wells.oliver@gmail.com >>> >> >> >> -- >> Death to , and butter sauce. >> Don't boil me, I'm still alive. >> lobster! >> > > > -- > Wells Oliver > wells.oliver@gmail.com > --=20 Wells Oliver wells.oliver@gmail.com --0000000000002448f906271f2d3e Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable
Act= ually, in RDS it seems you cannot set=C2=A0archive_mode either.
=
On Sun= , Nov 17, 2024 at 9:23=E2=80=AFAM Wells Oliver <wells.oliver@gmail.com> wrote:
It does. I think it uses WAL behind th= e scenes. In RDS unfortunately cannot set wal_level, but you can set archiv= e_mode.

On Sun, Nov 17, 2024 at 9:21=E2=80=AFAM Ron Johnson <ronljohnsonjr@gmail.= com> wrote:
Doesn't RDS have its own replication?

Anyway, for pg_restore, I'd absolutely set=C2=A0archiv= e_mode=3Doff and=C2=A0wal_level=3Dminimal, then set them to their productio= n values when it's finished.

On Su= n, Nov 17, 2024 at 12:12=E2=80=AFPM Wells Oliver <wells.oliver@gmail.com> wrote:=
Interesting. I = am migrating a pg_dump archive to a new server,=C2=A0in a single go. Does i= t make=C2=A0sense to disable (or speed up?) WAL archiving during the restor= e, then reenable it after the restore so a future replica could work? What = would be the steps here? Would disabling or "speeding up" be fast= er?

max_slot_wal_keep_size is -1 at the moment so I think that's why = it kept a ton of WAL and ran out of space.

On Sun, Nov 17, 2024 at 7:4= 1=E2=80=AFAM Laurenz Albe <laurenz.albe@cybertec.at> wrote:
On Sat, 2024-11-16 at 16:33 -0800,= Wells Oliver wrote:
> I provisioned an RDS instance with 2500GB space and began the restore = of a database I know to be about 1750 GB using 16 jobs.
>
> Unfortunately, it died very near the end when it ran out of disk space= due to WAL log usage. Lots of:
>
> 2024-11-17 00:07:09 UTC::@:[19861]:PANIC:=C2=A0 could not write to fil= e "pg_wal/xlogtemp.19861": No space left on device
>
>
> And then kaboom.
>
> I'm wondering what my course of action should be. Can I disable/re= duce WAL during a restore?
> wal_level is set to replica, can this temporarily be set to minimal? S= hould I just eat the extra
> costs to add headroom for the WAL? Would using fewer jobs during a res= tore reduce the amount of WAL
> created?

If you are using minimal WAL logging and you restore the dump in a single t= ransaction, you
should see way less WAL generated, because data inserted into the table in = the same transaction
as the CREATE TABLE statement need not be WAL logged.

But you might more easily solve the problem by speeding up or disabling the= WAL archiver,
so that PostgreSQL removes old WAL after the next checkpoint.

Yours,
Laurenz Albe


--


--
Death to <Redacted>, and butter sauce.Don't boil me, I'm still alive.
<Redacted> lobs= ter!


--


--
--0000000000002448f906271f2d3e--