Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1tCj0A-00BaMj-Vj for pgsql-admin@arkaria.postgresql.org; Sun, 17 Nov 2024 17:23:58 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.94.2) (envelope-from ) id 1tCj08-000j4v-CB for pgsql-admin@arkaria.postgresql.org; Sun, 17 Nov 2024 17:23:56 +0000 Received: from makus.postgresql.org ([2001:4800:3e1:1::229]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1tCj07-000j4l-Sv for pgsql-admin@lists.postgresql.org; Sun, 17 Nov 2024 17:23:56 +0000 Received: from mail-ej1-x62c.google.com ([2a00:1450:4864:20::62c]) by makus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.94.2) (envelope-from ) id 1tCj05-002Nps-S5 for pgsql-admin@postgresql.org; Sun, 17 Nov 2024 17:23:55 +0000 Received: by mail-ej1-x62c.google.com with SMTP id a640c23a62f3a-aa1e51ce601so263387266b.3 for ; Sun, 17 Nov 2024 09:23:53 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1731864232; x=1732469032; darn=postgresql.org; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:from:to:cc:subject:date:message-id:reply-to; bh=GowvOHAhBk/EUP4UNxGIdl8oEJeVAmv/eSIie+ckRZQ=; b=CEirH/Z6p8jm2ImwM2V2HZxLhK9/FGwOSV85mEvnzWtmRjiNERIf4z2c7ETolIqHlC NYjkeOa9XSHF10zXOc9/225O3zp/Ha3MSQ1TGtQgQKhasx85EQhiO/pxtyDaLA9EI28J vFPspsn/vdhl+zj209tu+Yp4/8err+qepQsvML4HC0wYRmzMfI8a8ihGmu+JB+b/ZaUp hJL9jy3KNiIpu0daOUtKZfkgtzTJoCdnQizLipYlKw5Tx4wt0v77UmLEO/jK8g2E0vMS vdwNnFR6y8VgV/gkBAelQSk3kATBleE5tHJ5PUzfTVJjO31JdRkhdH8Y2cGmssbD5efa QNYQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1731864232; x=1732469032; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=GowvOHAhBk/EUP4UNxGIdl8oEJeVAmv/eSIie+ckRZQ=; b=bhIF1Px/RycoIawCtRlQXglEFq0LKUa/sp2VPKS4Fnpao/Xy5/31lAOpgI8J1kJqie +YNei9na78cW2aNw8OadM0T6PAsXLOvlkFJgYH9qiCc6i6tVyWuHnsKio6IHFQWDodql uKSWU8o7mOXD3uooOyGLb+YMIbCmBqsIjegx78dhH8oj+76scvHldyKScNYCjzyCJUur SVfpsFSDnB8vupdNaOc1DsRz/aZDgj20jPmgYCsUzG8ycVv3hz7CYsfhUTgQk9iVRIXF 19rW8+jCa9AIM7Qa8cng/3zq+OJd5zhOfeUq8OfUQYZ2hixNKPtYKPCBoAJFV9Nu40dj eqdQ== X-Gm-Message-State: AOJu0YyBN5/x6r0L1v6yJ4elZCPioXjWGRhKQOv+fMjXMHeEudC/lQ2y oWXkrcFVkJ8V+svwrjAHwIpxobdjYkm3xyE715GM0fw6i7BWeZGX28xvWVQEoZeOp6mieZVuJ3T nBHE8kAHS+8+a6dW3cPQtLsDh6nl/PA== X-Google-Smtp-Source: AGHT+IH0DfWB9faO/c4aZPoIumko2S0C4BzsBHG8q/NMV4eX8vtY0JZnMEQdYKDyxsee1CSiVUKLkYpRrsXAnGhTHNQ= X-Received: by 2002:a17:907:3e0f:b0:a7a:9f0f:ab18 with SMTP id a640c23a62f3a-aa48341caa6mr914563866b.20.1731864231748; Sun, 17 Nov 2024 09:23:51 -0800 (PST) MIME-Version: 1.0 References: <14770c231bf27ea6d22376395ac8f02e41462ed5.camel@cybertec.at> In-Reply-To: From: Wells Oliver Date: Sun, 17 Nov 2024 09:23:15 -0800 Message-ID: Subject: Re: RDS restore failed due to WAL log and disk space-- any tidy fixes? To: Ron Johnson Cc: pgsql-admin Content-Type: multipart/alternative; boundary="000000000000552c5206271f1000" List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk --000000000000552c5206271f1000 Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable It does. I think it uses WAL behind the scenes. In RDS unfortunately cannot set wal_level, but you can set archive_mode. On Sun, Nov 17, 2024 at 9:21=E2=80=AFAM Ron Johnson wrote: > Doesn't RDS have its own replication? > > Anyway, for pg_restore, I'd absolutely set archive_mode=3Doff > and wal_level=3Dminimal, then set them to their production values when it= 's > finished. > > On Sun, Nov 17, 2024 at 12:12=E2=80=AFPM Wells Oliver > wrote: > >> Interesting. I am migrating a pg_dump archive to a new server, in a >> single go. Does it make sense to disable (or speed up?) WAL archiving >> during the restore, then reenable it after the restore so a future repli= ca >> could work? What would be the steps here? Would disabling or "speeding u= p" >> be faster? >> >> max_slot_wal_keep_size is -1 at the moment so I think that's why it kept >> a ton of WAL and ran out of space. >> >> On Sun, Nov 17, 2024 at 7:41=E2=80=AFAM Laurenz Albe >> wrote: >> >>> On Sat, 2024-11-16 at 16:33 -0800, Wells Oliver wrote: >>> > I provisioned an RDS instance with 2500GB space and began the restore >>> of a database I know to be about 1750 GB using 16 jobs. >>> > >>> > Unfortunately, it died very near the end when it ran out of disk spac= e >>> due to WAL log usage. Lots of: >>> > >>> > 2024-11-17 00:07:09 UTC::@:[19861]:PANIC: could not write to file >>> "pg_wal/xlogtemp.19861": No space left on device >>> > >>> > >>> > And then kaboom. >>> > >>> > I'm wondering what my course of action should be. Can I disable/reduc= e >>> WAL during a restore? >>> > wal_level is set to replica, can this temporarily be set to minimal? >>> Should I just eat the extra >>> > costs to add headroom for the WAL? Would using fewer jobs during a >>> restore reduce the amount of WAL >>> > created? >>> >>> If you are using minimal WAL logging and you restore the dump in a >>> single transaction, you >>> should see way less WAL generated, because data inserted into the table >>> in the same transaction >>> as the CREATE TABLE statement need not be WAL logged. >>> >>> But you might more easily solve the problem by speeding up or disabling >>> the WAL archiver, >>> so that PostgreSQL removes old WAL after the next checkpoint. >>> >>> Yours, >>> Laurenz Albe >>> >> >> >> -- >> Wells Oliver >> wells.oliver@gmail.com >> > > > -- > Death to , and butter sauce. > Don't boil me, I'm still alive. > lobster! > --=20 Wells Oliver wells.oliver@gmail.com --000000000000552c5206271f1000 Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable
It = does. I think it uses WAL behind the scenes. In RDS unfortunately cannot se= t wal_level, but you can set archive_mode.

On Sun, Nov 17, 2024 at 9:2= 1=E2=80=AFAM Ron Johnson <ron= ljohnsonjr@gmail.com> wrote:
Doesn't RDS have its own repl= ication?

Anyway, for pg_restore, I'd absolutel= y set=C2=A0archive_mode=3Doff and=C2=A0wal_level=3Dminimal, then set them t= o their production values when it's finished.

On Sun, Nov 17, 2024 at 12:12=E2=80=AFPM Wells Oliver <wells.oliver@gmail.c= om> wrote:
Interesting. I am migrating a pg_dump archive to a new server,=C2=A0in a = single go. Does it make=C2=A0sense to disable (or speed up?) WAL archiving = during the restore, then reenable it after the restore so a future replica = could work? What would be the steps here? Would disabling or "speeding= up" be faster?

max_slot_wal_keep_size is -1 at the moment so I thin= k that's why it kept a ton of WAL and ran out of space.

=
On Sun, No= v 17, 2024 at 7:41=E2=80=AFAM Laurenz Albe <laurenz.albe@cybertec.at> wrote:
On Sat, 2024-11-1= 6 at 16:33 -0800, Wells Oliver wrote:
> I provisioned an RDS instance with 2500GB space and began the restore = of a database I know to be about 1750 GB using 16 jobs.
>
> Unfortunately, it died very near the end when it ran out of disk space= due to WAL log usage. Lots of:
>
> 2024-11-17 00:07:09 UTC::@:[19861]:PANIC:=C2=A0 could not write to fil= e "pg_wal/xlogtemp.19861": No space left on device
>
>
> And then kaboom.
>
> I'm wondering what my course of action should be. Can I disable/re= duce WAL during a restore?
> wal_level is set to replica, can this temporarily be set to minimal? S= hould I just eat the extra
> costs to add headroom for the WAL? Would using fewer jobs during a res= tore reduce the amount of WAL
> created?

If you are using minimal WAL logging and you restore the dump in a single t= ransaction, you
should see way less WAL generated, because data inserted into the table in = the same transaction
as the CREATE TABLE statement need not be WAL logged.

But you might more easily solve the problem by speeding up or disabling the= WAL archiver,
so that PostgreSQL removes old WAL after the next checkpoint.

Yours,
Laurenz Albe


--


--
Death to <Redacted>, and butter sauce.Don't boil me, I'm still alive.
<Redacted> lobs= ter!


--
--000000000000552c5206271f1000--