MIME-Version: 1.0
References: <a9f9376f1c3343a6bb319dce294e20ac@EX13D05UWC001.ant.amazon.com>
 <CABUevEwyLb9VE0D+bAQtUnaA7bffXYzBpopYuh7kGTQxY9T5_g@mail.gmail.com>
In-Reply-To: 
 <CABUevEwyLb9VE0D+bAQtUnaA7bffXYzBpopYuh7kGTQxY9T5_g@mail.gmail.com>
From: Robins Tharakan <tharakan@gmail.com>
Date: Tue, 9 Mar 2021 01:13:02 +1100
Message-ID: 
 <CAEP4nAw2WA1wyb9LG7BOEuN3Xr-xWiZZ0w_hKtpyvdUPKmcAJA@mail.gmail.com>
Subject: Re: pg_upgrade failing for 200+ million Large Objects
To: Magnus Hagander <magnus@hagander.net>
Cc: Peter Eisentraut <peter.eisentraut@enterprisedb.com>,
	"pgsql-hackers@postgresql.org" <pgsql-hackers@postgresql.org>
Content-Type: multipart/alternative; boundary="00000000000048410605bd070a07"
Archived-At: 
 <https://www.postgresql.org/message-id/CAEP4nAw2WA1wyb9LG7BOEuN3Xr-xWiZZ0w_hKtpyvdUPKmcAJA%40mail.gmail.com>
Precedence: bulk

--00000000000048410605bd070a07
Content-Type: text/plain; charset="UTF-8"

Hi Magnus,

On Mon, 8 Mar 2021 at 23:34, Magnus Hagander <magnus@hagander.net> wrote:

> AFAICT at a quick check, pg_dump in binary upgrade mode emits one
> lo_create() and one ALTER ... OWNER TO for each large object - so with
> 500M large objects that would be a billion statements, and thus a
> billion xids. And without checking, I'm fairly sure it doesn't load in
> a single transaction...
>

Your assumptions are pretty much correct.

The issue isn't with pg_upgrade itself. During pg_restore, each Large
Object creation (and, separately, each ALTER LARGE OBJECT ... OWNER TO)
consumes an XID. For background, that's why the v9.5 production instance I
was reviewing was unable to process more than 73 million large objects,
since each object required a CREATE + ALTER. (To clarify, 73 million =
(2^31 - 2 billion magic constant - 1 million wraparound protection) / 2.)
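
To make that arithmetic concrete, here is a rough back-of-the-envelope
check. The constant and function names are mine, purely for illustration;
they do not correspond to identifiers in the PostgreSQL source.

```python
# Back-of-the-envelope check of the ~73 million figure.
# All names below are illustrative, not PostgreSQL identifiers.
XID_HALF_SPACE = 2**31                # 2,147,483,648
MAGIC_CONSTANT = 2_000_000_000        # the "2 billion magic constant"
WRAPAROUND_PROTECTION = 1_000_000     # the "1 million wraparound protection"
XIDS_PER_LARGE_OBJECT = 2             # one for the CREATE, one for the ALTER


def max_large_objects(xids_per_object: int = XIDS_PER_LARGE_OBJECT) -> int:
    """Number of large objects that fit in the XIDs left after both deductions."""
    usable_xids = XID_HALF_SPACE - MAGIC_CONSTANT - WRAPAROUND_PROTECTION
    return usable_xids // xids_per_object


print(max_large_objects())  # 73241824
```

With only one XID per object, the same headroom would cover roughly twice
as many objects, which is still far short of 200+ million.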


> Without looking, I would guess it's the schema reload using
> pg_dump/pg_restore and not actually pg_upgrade itself. This is a known
> issue in pg_dump/pg_restore. And if that is the case -- perhaps just
> running all of those in a single transaction would be a better choice?
> One could argue it's still not a proper fix, because we'd still have a
> huge memory usage etc, but it would then only burn 1 xid instead of
> 500M...
>

(I hope I'm not missing something, but) when I tried to force pg_restore to
use a single transaction (by hacking pg_upgrade's pg_restore call to use
--single-transaction), it too failed, because it could not lock that many
objects in a single transaction.
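
For a sense of scale, and assuming the failure is the shared lock table
filling up, the sketch below estimates how many object locks that table is
sized for. The formula and defaults are the documented ones for
max_locks_per_transaction, max_connections and max_prepared_transactions;
the function name is made up, and the real table can stretch somewhat
beyond this estimate.

```python
# Rough estimate of how many object locks the shared lock table is sized for.
# The function name is hypothetical; the parameters mirror the PostgreSQL
# settings of the same names, shown with their usual defaults.
def lock_table_slots(
    max_locks_per_transaction: int = 64,
    max_connections: int = 100,
    max_prepared_transactions: int = 0,
) -> int:
    return max_locks_per_transaction * (max_connections + max_prepared_transactions)


print(lock_table_slots())  # 6400
```

That is many orders of magnitude below 200+ million objects, so raising
the settings alone does not look like a practical way out.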


> This still seems to just fix the symptoms and not the actual problem.
>

I agree that the patch doesn't address the root cause, but it did get the
upgrade to complete on a test setup. Do you think that batching multiple
Large Objects per transaction (instead of all objects in one), and letting
the caller size that batch via the command line, would be an acceptable
idea here?
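
To illustrate the batching idea only (this is not the patch, and it is not
how pg_restore is implemented), here is a minimal sketch that groups the
per-object statements into transactions of a caller-chosen size. The
function names, the batch_size parameter and the exact statement text are
assumptions for the example.

```python
from itertools import islice
from typing import Iterable, Iterator, List


def large_object_statements(oid: int, owner: str) -> List[str]:
    """Approximate pair of statements emitted per large object (illustrative)."""
    return [
        f"SELECT pg_catalog.lo_create('{oid}');",
        f"ALTER LARGE OBJECT {oid} OWNER TO {owner};",
    ]


def batched_transactions(
    oids: Iterable[int], owner: str, batch_size: int
) -> Iterator[List[str]]:
    """Yield one BEGIN ... COMMIT block per batch of `batch_size` large objects."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    it = iter(oids)
    while True:
        chunk = list(islice(it, batch_size))
        if not chunk:
            return
        body = [stmt for oid in chunk for stmt in large_object_statements(oid, owner)]
        yield ["BEGIN;", *body, "COMMIT;"]


# Five objects in batches of two: three transactions instead of ten.
for txn in batched_transactions(range(16385, 16390), "postgres", batch_size=2):
    print("\n".join(txn))
```

The number of transactions becomes ceil(objects / batch_size), while the
number of objects touched per transaction stays bounded by the batch size,
which is the trade-off between XID consumption and lock usage.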

> Please take a look at your email configuration -- all your emails are
> lacking both References and In-reply-to headers.
>

Thanks for highlighting the cause here. Hopefully switching mail clients
will help.
-
Robins Tharakan


--00000000000048410605bd070a07--