public inbox for [email protected]
help / color / mirror / Atom feedRe: pg_upgrade failing for 200+ million Large Objects
43+ messages / 8 participants
[nested] [flat]
* Re: pg_upgrade failing for 200+ million Large Objects
@ 2021-03-08 11:02 Tharakan, Robins <[email protected]>
2021-03-08 12:33 ` Re: pg_upgrade failing for 200+ million Large Objects Magnus Hagander <[email protected]>
0 siblings, 1 reply; 43+ messages in thread
From: Tharakan, Robins @ 2021-03-08 11:02 UTC (permalink / raw)
To: Peter Eisentraut <[email protected]>; +Cc: pgsql-hackers
Thanks Peter.
The original email [1] had some more context that somehow didn't get
associated with this recent email. Apologies for any confusion.
In short, pg_resetxlog (and pg_resetwal) employs a magic constant [2] (for
both v9.6 as well as master) which seems to have been selected to force an
aggressive autovacuum as soon as the upgrade completes. Although that works
as planned, it narrows the window of Transaction IDs available for the
upgrade (before which XID wraparound protection kicks and aborts the
upgrade) to 146 Million.
Reducing this magic constant allows a larger XID window, which is what the
patch is trying to do. With the patch, I was able to upgrade a cluster with
500m Large Objects successfully (which otherwise reliably fails). In the
original email [1] I had also listed a few other possible workarounds, but
was unsure which would be a good direction to start working on.... thus this
patch to make a start.
Reference:
1) https://www.postgresql.org/message-id/12601596dbbc4c01b86b4ac4d2bd4d48%40
EX13D05UWC001.ant.amazon.com
2) https://github.com/postgres/postgres/blob/master/src/bin/pg_resetwal/pg_r
esetwal.c#L444
-
robins | tharar@ | syd12
> -----Original Message-----
> From: Peter Eisentraut <[email protected]>
> Sent: Monday, 8 March 2021 9:25 PM
> To: Tharakan, Robins <[email protected]>; [email protected]
> Subject: [EXTERNAL] [UNVERIFIED SENDER] Re: pg_upgrade failing for 200+
> million Large Objects
>
> CAUTION: This email originated from outside of the organization. Do not
> click links or open attachments unless you can confirm the sender and
> know the content is safe.
>
>
>
> On 07.03.21 09:43, Tharakan, Robins wrote:
> > Attached is a proof-of-concept patch that allows Postgres to perform
> > pg_upgrade if the instance has Millions of objects.
> >
> > It would be great if someone could take a look and see if this patch
> > is in the right direction. There are some pending tasks (such as
> > documentation / pg_resetxlog vs pg_resetwal related changes) but for
> > now, the patch helps remove a stalemate where if a Postgres instance
> > has a large number (accurately speaking 146+ Million) of Large
> > Objects, pg_upgrade fails. This is easily reproducible and besides
> > deleting Large Objects before upgrade, there is no other (apparent) way
> for pg_upgrade to complete.
> >
> > The patch (attached):
> > - Applies cleanly on REL9_6_STABLE -
> > c7a4fc3dd001646d5938687ad59ab84545d5d043
> > - 'make check' passes
> > - Allows the user to provide a constant via pg_upgrade command-line,
> > that overrides the 2 billion constant in pg_resetxlog [1] thereby
> > increasing the (window of) Transaction IDs available for pg_upgrade to
> complete.
>
> Could you explain what your analysis of the problem is and why this patch
> (might) fix it?
>
> Right now, all I see here is, pass a big number via a command-line option
> and hope it works.
Attachments:
[application/pkcs7-signature] smime.p7s (5.0K, 2-smime.p7s)
download
^ permalink raw reply [nested|flat] 43+ messages in thread
* Re: pg_upgrade failing for 200+ million Large Objects
2021-03-08 11:02 Re: pg_upgrade failing for 200+ million Large Objects Tharakan, Robins <[email protected]>
@ 2021-03-08 12:33 ` Magnus Hagander <[email protected]>
2021-03-08 14:13 ` Re: pg_upgrade failing for 200+ million Large Objects Robins Tharakan <[email protected]>
0 siblings, 1 reply; 43+ messages in thread
From: Magnus Hagander @ 2021-03-08 12:33 UTC (permalink / raw)
To: Tharakan, Robins <[email protected]>; +Cc: Peter Eisentraut <[email protected]>; pgsql-hackers
On Mon, Mar 8, 2021 at 12:02 PM Tharakan, Robins <[email protected]> wrote:
>
> Thanks Peter.
>
> The original email [1] had some more context that somehow didn't get
> associated with this recent email. Apologies for any confusion.
Please take a look at your email configuration -- all your emails are
lacking both References and In-reply-to headers, so every email starts
a new thread, both for each reader and in the archives. It seems quite
broken. It makes it very hard to follow.
> In short, pg_resetxlog (and pg_resetwal) employs a magic constant [2] (for
> both v9.6 as well as master) which seems to have been selected to force an
> aggressive autovacuum as soon as the upgrade completes. Although that works
> as planned, it narrows the window of Transaction IDs available for the
> upgrade (before which XID wraparound protection kicks and aborts the
> upgrade) to 146 Million.
>
> Reducing this magic constant allows a larger XID window, which is what the
> patch is trying to do. With the patch, I was able to upgrade a cluster with
> 500m Large Objects successfully (which otherwise reliably fails). In the
> original email [1] I had also listed a few other possible workarounds, but
> was unsure which would be a good direction to start working on.... thus this
> patch to make a start.
This still seems to just fix the symptoms and not the actual problem.
What part of the pg_upgrade process is it that actually burns through
that many transactions?
Without looking, I would guess it's the schema reload using
pg_dump/pg_restore and not actually pg_upgrade itself. This is a known
issue in pg_dump/pg_restore. And if that is the case -- perhaps just
running all of those in a single transaction would be a better choice?
One could argue it's still not a proper fix, because we'd still have a
huge memory usage etc, but it would then only burn 1 xid instead of
500M...
AFAICT at a quick check, pg_dump in binary upgrade mode emits one
lo_create() and one ALTER ... OWNER TO for each large object - so with
500M large objects that would be a billion statements, and thus a
billion xids. And without checking, I'm fairly sure it doesn't load in
a single transaction...
--
Magnus Hagander
Me: https://www.hagander.net/
Work: https://www.redpill-linpro.com/
^ permalink raw reply [nested|flat] 43+ messages in thread
* Re: pg_upgrade failing for 200+ million Large Objects
2021-03-08 11:02 Re: pg_upgrade failing for 200+ million Large Objects Tharakan, Robins <[email protected]>
2021-03-08 12:33 ` Re: pg_upgrade failing for 200+ million Large Objects Magnus Hagander <[email protected]>
@ 2021-03-08 14:13 ` Robins Tharakan <[email protected]>
2021-03-08 16:33 ` Re: pg_upgrade failing for 200+ million Large Objects Tom Lane <[email protected]>
0 siblings, 1 reply; 43+ messages in thread
From: Robins Tharakan @ 2021-03-08 14:13 UTC (permalink / raw)
To: Magnus Hagander <[email protected]>; +Cc: Peter Eisentraut <[email protected]>; pgsql-hackers
Hi Magnus,
On Mon, 8 Mar 2021 at 23:34, Magnus Hagander <[email protected]> wrote:
> AFAICT at a quick check, pg_dump in binary upgrade mode emits one
lo_create() and one ALTER ... OWNER TO for each large object - so with
> 500M large objects that would be a billion statements, and thus a
> billion xids. And without checking, I'm fairly sure it doesn't load in
> a single transaction...
>
Your assumptions are pretty much correct.
The issue isn't with pg_upgrade itself. During pg_restore, each Large
Object (and separately each ALTER LARGE OBJECT OWNER TO) consumes an XID
each. For background, that's the reason the v9.5 production instance I was
reviewing, was unable to process more than 73 Million large objects since
each object required a CREATE + ALTER. (To clarify, 73 million = (2^31 - 2
billion magic constant - 1 Million wraparound protection) / 2)
Without looking, I would guess it's the schema reload using
> pg_dump/pg_restore and not actually pg_upgrade itself. This is a known
> issue in pg_dump/pg_restore. And if that is the case -- perhaps just
> running all of those in a single transaction would be a better choice?
> One could argue it's still not a proper fix, because we'd still have a
> huge memory usage etc, but it would then only burn 1 xid instead of
> 500M...
>
(I hope I am not missing something but) When I tried to force pg_restore to
use a single transaction (by hacking pg_upgrade's pg_restore call to use
--single-transaction), it too failed owing to being unable to lock so many
objects in a single transaction.
This still seems to just fix the symptoms and not the actual problem.
>
I agree that the patch doesn't address the root-cause, but it did get the
upgrade to complete on a test-setup. Do you think that (instead of all
objects) batching multiple Large Objects in a single transaction (and
allowing the caller to size that batch via command line) would be a good /
acceptable idea here?
Please take a look at your email configuration -- all your emails are
> lacking both References and In-reply-to headers.
>
Thanks for highlighting the cause here. Hopefully switching mail clients
would help.
-
Robins Tharakan
^ permalink raw reply [nested|flat] 43+ messages in thread
* Re: pg_upgrade failing for 200+ million Large Objects
2021-03-08 11:02 Re: pg_upgrade failing for 200+ million Large Objects Tharakan, Robins <[email protected]>
2021-03-08 12:33 ` Re: pg_upgrade failing for 200+ million Large Objects Magnus Hagander <[email protected]>
2021-03-08 14:13 ` Re: pg_upgrade failing for 200+ million Large Objects Robins Tharakan <[email protected]>
@ 2021-03-08 16:33 ` Tom Lane <[email protected]>
2021-03-08 16:35 ` Re: pg_upgrade failing for 200+ million Large Objects Magnus Hagander <[email protected]>
0 siblings, 1 reply; 43+ messages in thread
From: Tom Lane @ 2021-03-08 16:33 UTC (permalink / raw)
To: Robins Tharakan <[email protected]>; +Cc: Magnus Hagander <[email protected]>; Peter Eisentraut <[email protected]>; pgsql-hackers
Robins Tharakan <[email protected]> writes:
> On Mon, 8 Mar 2021 at 23:34, Magnus Hagander <[email protected]> wrote:
>> Without looking, I would guess it's the schema reload using
>> pg_dump/pg_restore and not actually pg_upgrade itself. This is a known
>> issue in pg_dump/pg_restore. And if that is the case -- perhaps just
>> running all of those in a single transaction would be a better choice?
>> One could argue it's still not a proper fix, because we'd still have a
>> huge memory usage etc, but it would then only burn 1 xid instead of
>> 500M...
> (I hope I am not missing something but) When I tried to force pg_restore to
> use a single transaction (by hacking pg_upgrade's pg_restore call to use
> --single-transaction), it too failed owing to being unable to lock so many
> objects in a single transaction.
It does seem that --single-transaction is a better idea than fiddling with
the transaction wraparound parameters, since the latter is just going to
put off the onset of trouble. However, we'd have to do something about
the lock consumption. Would it be sane to have the backend not bother to
take any locks in binary-upgrade mode?
regards, tom lane
^ permalink raw reply [nested|flat] 43+ messages in thread
* Re: pg_upgrade failing for 200+ million Large Objects
2021-03-08 11:02 Re: pg_upgrade failing for 200+ million Large Objects Tharakan, Robins <[email protected]>
2021-03-08 12:33 ` Re: pg_upgrade failing for 200+ million Large Objects Magnus Hagander <[email protected]>
2021-03-08 14:13 ` Re: pg_upgrade failing for 200+ million Large Objects Robins Tharakan <[email protected]>
2021-03-08 16:33 ` Re: pg_upgrade failing for 200+ million Large Objects Tom Lane <[email protected]>
@ 2021-03-08 16:35 ` Magnus Hagander <[email protected]>
2021-03-08 16:58 ` Re: pg_upgrade failing for 200+ million Large Objects Tom Lane <[email protected]>
0 siblings, 1 reply; 43+ messages in thread
From: Magnus Hagander @ 2021-03-08 16:35 UTC (permalink / raw)
To: Tom Lane <[email protected]>; +Cc: Robins Tharakan <[email protected]>; Peter Eisentraut <[email protected]>; pgsql-hackers
On Mon, Mar 8, 2021 at 5:33 PM Tom Lane <[email protected]> wrote:
>
> Robins Tharakan <[email protected]> writes:
> > On Mon, 8 Mar 2021 at 23:34, Magnus Hagander <[email protected]> wrote:
> >> Without looking, I would guess it's the schema reload using
> >> pg_dump/pg_restore and not actually pg_upgrade itself. This is a known
> >> issue in pg_dump/pg_restore. And if that is the case -- perhaps just
> >> running all of those in a single transaction would be a better choice?
> >> One could argue it's still not a proper fix, because we'd still have a
> >> huge memory usage etc, but it would then only burn 1 xid instead of
> >> 500M...
>
> > (I hope I am not missing something but) When I tried to force pg_restore to
> > use a single transaction (by hacking pg_upgrade's pg_restore call to use
> > --single-transaction), it too failed owing to being unable to lock so many
> > objects in a single transaction.
>
> It does seem that --single-transaction is a better idea than fiddling with
> the transaction wraparound parameters, since the latter is just going to
> put off the onset of trouble. However, we'd have to do something about
> the lock consumption. Would it be sane to have the backend not bother to
> take any locks in binary-upgrade mode?
I believe the problem occurs when writing them rather than when
reading them, and I don't think we have a binary upgrade mode there.
We could invent one of course. Another option might be to exclusively
lock pg_largeobject, and just say that if you do that, we don't have
to lock the individual objects (ever)?
--
Magnus Hagander
Me: https://www.hagander.net/
Work: https://www.redpill-linpro.com/
^ permalink raw reply [nested|flat] 43+ messages in thread
* Re: pg_upgrade failing for 200+ million Large Objects
2021-03-08 11:02 Re: pg_upgrade failing for 200+ million Large Objects Tharakan, Robins <[email protected]>
2021-03-08 12:33 ` Re: pg_upgrade failing for 200+ million Large Objects Magnus Hagander <[email protected]>
2021-03-08 14:13 ` Re: pg_upgrade failing for 200+ million Large Objects Robins Tharakan <[email protected]>
2021-03-08 16:33 ` Re: pg_upgrade failing for 200+ million Large Objects Tom Lane <[email protected]>
2021-03-08 16:35 ` Re: pg_upgrade failing for 200+ million Large Objects Magnus Hagander <[email protected]>
@ 2021-03-08 16:58 ` Tom Lane <[email protected]>
2021-03-08 17:18 ` Re: pg_upgrade failing for 200+ million Large Objects Magnus Hagander <[email protected]>
2021-03-20 04:39 ` Re: pg_upgrade failing for 200+ million Large Objects Jan Wieck <[email protected]>
0 siblings, 2 replies; 43+ messages in thread
From: Tom Lane @ 2021-03-08 16:58 UTC (permalink / raw)
To: Magnus Hagander <[email protected]>; +Cc: Robins Tharakan <[email protected]>; Peter Eisentraut <[email protected]>; pgsql-hackers
Magnus Hagander <[email protected]> writes:
> On Mon, Mar 8, 2021 at 5:33 PM Tom Lane <[email protected]> wrote:
>> It does seem that --single-transaction is a better idea than fiddling with
>> the transaction wraparound parameters, since the latter is just going to
>> put off the onset of trouble. However, we'd have to do something about
>> the lock consumption. Would it be sane to have the backend not bother to
>> take any locks in binary-upgrade mode?
> I believe the problem occurs when writing them rather than when
> reading them, and I don't think we have a binary upgrade mode there.
You're confusing pg_dump's --binary-upgrade switch (indeed applied on
the dumping side) with the backend's -b switch (IsBinaryUpgrade,
applied on the restoring side).
> We could invent one of course. Another option might be to exclusively
> lock pg_largeobject, and just say that if you do that, we don't have
> to lock the individual objects (ever)?
What was in the back of my mind is that we've sometimes seen complaints
about too many locks needed to dump or restore a database with $MANY
tables; so the large-object case seems like just a special case.
The answer up to now has been "raise max_locks_per_transaction enough
so you don't see the failure". Having now consumed a little more
caffeine, I remember that that works in pg_upgrade scenarios too,
since the user can fiddle with the target cluster's postgresql.conf
before starting pg_upgrade.
So it seems like the path of least resistance is
(a) make pg_upgrade use --single-transaction when calling pg_restore
(b) document (better) how to get around too-many-locks failures.
regards, tom lane
^ permalink raw reply [nested|flat] 43+ messages in thread
* Re: pg_upgrade failing for 200+ million Large Objects
2021-03-08 11:02 Re: pg_upgrade failing for 200+ million Large Objects Tharakan, Robins <[email protected]>
2021-03-08 12:33 ` Re: pg_upgrade failing for 200+ million Large Objects Magnus Hagander <[email protected]>
2021-03-08 14:13 ` Re: pg_upgrade failing for 200+ million Large Objects Robins Tharakan <[email protected]>
2021-03-08 16:33 ` Re: pg_upgrade failing for 200+ million Large Objects Tom Lane <[email protected]>
2021-03-08 16:35 ` Re: pg_upgrade failing for 200+ million Large Objects Magnus Hagander <[email protected]>
2021-03-08 16:58 ` Re: pg_upgrade failing for 200+ million Large Objects Tom Lane <[email protected]>
@ 2021-03-08 17:18 ` Magnus Hagander <[email protected]>
1 sibling, 0 replies; 43+ messages in thread
From: Magnus Hagander @ 2021-03-08 17:18 UTC (permalink / raw)
To: Tom Lane <[email protected]>; +Cc: Robins Tharakan <[email protected]>; Peter Eisentraut <[email protected]>; pgsql-hackers
On Mon, Mar 8, 2021 at 5:58 PM Tom Lane <[email protected]> wrote:
>
> Magnus Hagander <[email protected]> writes:
> > On Mon, Mar 8, 2021 at 5:33 PM Tom Lane <[email protected]> wrote:
> >> It does seem that --single-transaction is a better idea than fiddling with
> >> the transaction wraparound parameters, since the latter is just going to
> >> put off the onset of trouble. However, we'd have to do something about
> >> the lock consumption. Would it be sane to have the backend not bother to
> >> take any locks in binary-upgrade mode?
>
> > I believe the problem occurs when writing them rather than when
> > reading them, and I don't think we have a binary upgrade mode there.
>
> You're confusing pg_dump's --binary-upgrade switch (indeed applied on
> the dumping side) with the backend's -b switch (IsBinaryUpgrade,
> applied on the restoring side).
Ah. Yes, I am.
> > We could invent one of course. Another option might be to exclusively
> > lock pg_largeobject, and just say that if you do that, we don't have
> > to lock the individual objects (ever)?
>
> What was in the back of my mind is that we've sometimes seen complaints
> about too many locks needed to dump or restore a database with $MANY
> tables; so the large-object case seems like just a special case.
It is -- but I guess it's more likely to have 100M large objects than
to have 100M tables. (and the cutoff point comes a lot earlier than
100M). But the fundamental onei s the same.
> The answer up to now has been "raise max_locks_per_transaction enough
> so you don't see the failure". Having now consumed a little more
> caffeine, I remember that that works in pg_upgrade scenarios too,
> since the user can fiddle with the target cluster's postgresql.conf
> before starting pg_upgrade.
>
> So it seems like the path of least resistance is
>
> (a) make pg_upgrade use --single-transaction when calling pg_restore
>
> (b) document (better) how to get around too-many-locks failures.
Agreed. Certainly seems like a better path forward than arbitrarily
pushing the limit on number of transactions which just postpones the
problem.
--
Magnus Hagander
Me: https://www.hagander.net/
Work: https://www.redpill-linpro.com/
^ permalink raw reply [nested|flat] 43+ messages in thread
* Re: pg_upgrade failing for 200+ million Large Objects
2021-03-08 11:02 Re: pg_upgrade failing for 200+ million Large Objects Tharakan, Robins <[email protected]>
2021-03-08 12:33 ` Re: pg_upgrade failing for 200+ million Large Objects Magnus Hagander <[email protected]>
2021-03-08 14:13 ` Re: pg_upgrade failing for 200+ million Large Objects Robins Tharakan <[email protected]>
2021-03-08 16:33 ` Re: pg_upgrade failing for 200+ million Large Objects Tom Lane <[email protected]>
2021-03-08 16:35 ` Re: pg_upgrade failing for 200+ million Large Objects Magnus Hagander <[email protected]>
2021-03-08 16:58 ` Re: pg_upgrade failing for 200+ million Large Objects Tom Lane <[email protected]>
@ 2021-03-20 04:39 ` Jan Wieck <[email protected]>
2021-03-20 15:17 ` Re: pg_upgrade failing for 200+ million Large Objects Andrew Dunstan <[email protected]>
2021-03-20 15:23 ` Re: pg_upgrade failing for 200+ million Large Objects Tom Lane <[email protected]>
2021-03-21 16:50 ` Fix pg_upgrade to preserve datdba (was: Re: pg_upgrade failing for 200+ million Large Objects) Jan Wieck <[email protected]>
1 sibling, 3 replies; 43+ messages in thread
From: Jan Wieck @ 2021-03-20 04:39 UTC (permalink / raw)
To: Tom Lane <[email protected]>; Magnus Hagander <[email protected]>; +Cc: Robins Tharakan <[email protected]>; Peter Eisentraut <[email protected]>; pgsql-hackers
On 3/8/21 11:58 AM, Tom Lane wrote:
> The answer up to now has been "raise max_locks_per_transaction enough
> so you don't see the failure". Having now consumed a little more
> caffeine, I remember that that works in pg_upgrade scenarios too,
> since the user can fiddle with the target cluster's postgresql.conf
> before starting pg_upgrade.
>
> So it seems like the path of least resistance is
>
> (a) make pg_upgrade use --single-transaction when calling pg_restore
>
> (b) document (better) how to get around too-many-locks failures.
That would first require to fix how pg_upgrade is creating the
databases. It uses "pg_restore --create", which is mutually exclusive
with --single-transaction because we cannot create a database inside of
a transaction. On the way pg_upgrade also mangles the pg_database.datdba
(all databases are owned by postgres after an upgrade; will submit a
separate patch for that as I consider that a bug by itself).
All that aside, the entire approach doesn't scale.
In a hacked up pg_upgrade that does "createdb" first before calling
pg_upgrade with --single-transaction. I can upgrade 1M large objects with
max_locks_per_transaction = 5300
max_connectinons=100
which contradicts the docs. Need to find out where that math went off
the rails because that config should only have room for 530,000 locks,
not 1M. The same test fails with max_locks_per_transaction = 5200.
But this would mean that one has to modify the postgresql.conf to
something like 530,000 max_locks_per_transaction at 100 max_connections
in order to actually run a successful upgrade of 100M large objects.
This config requires 26GB of memory just for locks. Add to that the
memory pg_restore needs to load the entire TOC before even restoring a
single object.
Not going to work. But tests are still ongoing ...
Regards, Jan
--
Jan Wieck
Principle Database Engineer
Amazon Web Services
^ permalink raw reply [nested|flat] 43+ messages in thread
* Re: pg_upgrade failing for 200+ million Large Objects
2021-03-08 11:02 Re: pg_upgrade failing for 200+ million Large Objects Tharakan, Robins <[email protected]>
2021-03-08 12:33 ` Re: pg_upgrade failing for 200+ million Large Objects Magnus Hagander <[email protected]>
2021-03-08 14:13 ` Re: pg_upgrade failing for 200+ million Large Objects Robins Tharakan <[email protected]>
2021-03-08 16:33 ` Re: pg_upgrade failing for 200+ million Large Objects Tom Lane <[email protected]>
2021-03-08 16:35 ` Re: pg_upgrade failing for 200+ million Large Objects Magnus Hagander <[email protected]>
2021-03-08 16:58 ` Re: pg_upgrade failing for 200+ million Large Objects Tom Lane <[email protected]>
2021-03-20 04:39 ` Re: pg_upgrade failing for 200+ million Large Objects Jan Wieck <[email protected]>
@ 2021-03-20 15:17 ` Andrew Dunstan <[email protected]>
2 siblings, 0 replies; 43+ messages in thread
From: Andrew Dunstan @ 2021-03-20 15:17 UTC (permalink / raw)
To: Jan Wieck <[email protected]>; Tom Lane <[email protected]>; Magnus Hagander <[email protected]>; +Cc: Robins Tharakan <[email protected]>; Peter Eisentraut <[email protected]>; pgsql-hackers
On 3/20/21 12:39 AM, Jan Wieck wrote:
> On 3/8/21 11:58 AM, Tom Lane wrote:
>> The answer up to now has been "raise max_locks_per_transaction enough
>> so you don't see the failure". Having now consumed a little more
>> caffeine, I remember that that works in pg_upgrade scenarios too,
>> since the user can fiddle with the target cluster's postgresql.conf
>> before starting pg_upgrade.
>>
>> So it seems like the path of least resistance is
>>
>> (a) make pg_upgrade use --single-transaction when calling pg_restore
>>
>> (b) document (better) how to get around too-many-locks failures.
>
> That would first require to fix how pg_upgrade is creating the
> databases. It uses "pg_restore --create", which is mutually exclusive
> with --single-transaction because we cannot create a database inside
> of a transaction. On the way pg_upgrade also mangles the
> pg_database.datdba (all databases are owned by postgres after an
> upgrade; will submit a separate patch for that as I consider that a
> bug by itself).
>
> All that aside, the entire approach doesn't scale.
>
> In a hacked up pg_upgrade that does "createdb" first before calling
> pg_upgrade with --single-transaction. I can upgrade 1M large objects with
> max_locks_per_transaction = 5300
> max_connectinons=100
> which contradicts the docs. Need to find out where that math went off
> the rails because that config should only have room for 530,000 locks,
> not 1M. The same test fails with max_locks_per_transaction = 5200.
>
> But this would mean that one has to modify the postgresql.conf to
> something like 530,000 max_locks_per_transaction at 100
> max_connections in order to actually run a successful upgrade of 100M
> large objects. This config requires 26GB of memory just for locks. Add
> to that the memory pg_restore needs to load the entire TOC before even
> restoring a single object.
>
> Not going to work. But tests are still ongoing ...
I thought Tom's suggestion upthread:
> Would it be sane to have the backend not bother to
> take any locks in binary-upgrade mode?
was interesting. Could we do that on the restore side? After all, what
are we locking against in binary upgrade mode?
cheers
andrew
--
Andrew Dunstan
EDB: https://www.enterprisedb.com
^ permalink raw reply [nested|flat] 43+ messages in thread
* Re: pg_upgrade failing for 200+ million Large Objects
2021-03-08 11:02 Re: pg_upgrade failing for 200+ million Large Objects Tharakan, Robins <[email protected]>
2021-03-08 12:33 ` Re: pg_upgrade failing for 200+ million Large Objects Magnus Hagander <[email protected]>
2021-03-08 14:13 ` Re: pg_upgrade failing for 200+ million Large Objects Robins Tharakan <[email protected]>
2021-03-08 16:33 ` Re: pg_upgrade failing for 200+ million Large Objects Tom Lane <[email protected]>
2021-03-08 16:35 ` Re: pg_upgrade failing for 200+ million Large Objects Magnus Hagander <[email protected]>
2021-03-08 16:58 ` Re: pg_upgrade failing for 200+ million Large Objects Tom Lane <[email protected]>
2021-03-20 04:39 ` Re: pg_upgrade failing for 200+ million Large Objects Jan Wieck <[email protected]>
@ 2021-03-20 15:23 ` Tom Lane <[email protected]>
2021-03-20 16:45 ` Re: pg_upgrade failing for 200+ million Large Objects Bruce Momjian <[email protected]>
2021-03-20 16:55 ` Re: pg_upgrade failing for 200+ million Large Objects Jan Wieck <[email protected]>
2 siblings, 2 replies; 43+ messages in thread
From: Tom Lane @ 2021-03-20 15:23 UTC (permalink / raw)
To: Jan Wieck <[email protected]>; +Cc: Magnus Hagander <[email protected]>; Robins Tharakan <[email protected]>; Peter Eisentraut <[email protected]>; pgsql-hackers
Jan Wieck <[email protected]> writes:
> On 3/8/21 11:58 AM, Tom Lane wrote:
>> So it seems like the path of least resistance is
>> (a) make pg_upgrade use --single-transaction when calling pg_restore
>> (b) document (better) how to get around too-many-locks failures.
> That would first require to fix how pg_upgrade is creating the
> databases. It uses "pg_restore --create", which is mutually exclusive
> with --single-transaction because we cannot create a database inside of
> a transaction.
Ugh.
> All that aside, the entire approach doesn't scale.
Yeah, agreed. When we gave large objects individual ownership and ACL
info, it was argued that pg_dump could afford to treat each one as a
separate TOC entry because "you wouldn't have that many of them, if
they're large". The limits of that approach were obvious even at the
time, and I think now we're starting to see people for whom it really
doesn't work.
I wonder if pg_dump could improve matters cheaply by aggregating the
large objects by owner and ACL contents. That is, do
select distinct lomowner, lomacl from pg_largeobject_metadata;
and make just *one* BLOB TOC entry for each result. Then dump out
all the matching blobs under that heading.
A possible objection is that it'd reduce the ability to restore blobs
selectively, so maybe we'd need to make it optional.
Of course, that just reduces the memory consumption on the client
side; it does nothing for the locks. Can we get away with releasing the
lock immediately after doing an ALTER OWNER or GRANT/REVOKE on a blob?
regards, tom lane
^ permalink raw reply [nested|flat] 43+ messages in thread
* Re: pg_upgrade failing for 200+ million Large Objects
2021-03-08 11:02 Re: pg_upgrade failing for 200+ million Large Objects Tharakan, Robins <[email protected]>
2021-03-08 12:33 ` Re: pg_upgrade failing for 200+ million Large Objects Magnus Hagander <[email protected]>
2021-03-08 14:13 ` Re: pg_upgrade failing for 200+ million Large Objects Robins Tharakan <[email protected]>
2021-03-08 16:33 ` Re: pg_upgrade failing for 200+ million Large Objects Tom Lane <[email protected]>
2021-03-08 16:35 ` Re: pg_upgrade failing for 200+ million Large Objects Magnus Hagander <[email protected]>
2021-03-08 16:58 ` Re: pg_upgrade failing for 200+ million Large Objects Tom Lane <[email protected]>
2021-03-20 04:39 ` Re: pg_upgrade failing for 200+ million Large Objects Jan Wieck <[email protected]>
2021-03-20 15:23 ` Re: pg_upgrade failing for 200+ million Large Objects Tom Lane <[email protected]>
@ 2021-03-20 16:45 ` Bruce Momjian <[email protected]>
2021-03-20 16:53 ` Re: pg_upgrade failing for 200+ million Large Objects Tom Lane <[email protected]>
1 sibling, 1 reply; 43+ messages in thread
From: Bruce Momjian @ 2021-03-20 16:45 UTC (permalink / raw)
To: Tom Lane <[email protected]>; +Cc: Jan Wieck <[email protected]>; Magnus Hagander <[email protected]>; Robins Tharakan <[email protected]>; Peter Eisentraut <[email protected]>; pgsql-hackers
On Sat, Mar 20, 2021 at 11:23:19AM -0400, Tom Lane wrote:
> I wonder if pg_dump could improve matters cheaply by aggregating the
> large objects by owner and ACL contents. That is, do
>
> select distinct lomowner, lomacl from pg_largeobject_metadata;
>
> and make just *one* BLOB TOC entry for each result. Then dump out
> all the matching blobs under that heading.
>
> A possible objection is that it'd reduce the ability to restore blobs
> selectively, so maybe we'd need to make it optional.
>
> Of course, that just reduces the memory consumption on the client
> side; it does nothing for the locks. Can we get away with releasing the
> lock immediately after doing an ALTER OWNER or GRANT/REVOKE on a blob?
Well, in pg_upgrade mode you can, since there are no other cluster
users, but you might be asking for general pg_dump usage.
--
Bruce Momjian <[email protected]> https://momjian.us
EDB https://enterprisedb.com
If only the physical world exists, free will is an illusion.
^ permalink raw reply [nested|flat] 43+ messages in thread
* Re: pg_upgrade failing for 200+ million Large Objects
2021-03-08 11:02 Re: pg_upgrade failing for 200+ million Large Objects Tharakan, Robins <[email protected]>
2021-03-08 12:33 ` Re: pg_upgrade failing for 200+ million Large Objects Magnus Hagander <[email protected]>
2021-03-08 14:13 ` Re: pg_upgrade failing for 200+ million Large Objects Robins Tharakan <[email protected]>
2021-03-08 16:33 ` Re: pg_upgrade failing for 200+ million Large Objects Tom Lane <[email protected]>
2021-03-08 16:35 ` Re: pg_upgrade failing for 200+ million Large Objects Magnus Hagander <[email protected]>
2021-03-08 16:58 ` Re: pg_upgrade failing for 200+ million Large Objects Tom Lane <[email protected]>
2021-03-20 04:39 ` Re: pg_upgrade failing for 200+ million Large Objects Jan Wieck <[email protected]>
2021-03-20 15:23 ` Re: pg_upgrade failing for 200+ million Large Objects Tom Lane <[email protected]>
2021-03-20 16:45 ` Re: pg_upgrade failing for 200+ million Large Objects Bruce Momjian <[email protected]>
@ 2021-03-20 16:53 ` Tom Lane <[email protected]>
0 siblings, 0 replies; 43+ messages in thread
From: Tom Lane @ 2021-03-20 16:53 UTC (permalink / raw)
To: Bruce Momjian <[email protected]>; +Cc: Jan Wieck <[email protected]>; Magnus Hagander <[email protected]>; Robins Tharakan <[email protected]>; Peter Eisentraut <[email protected]>; pgsql-hackers
Bruce Momjian <[email protected]> writes:
> On Sat, Mar 20, 2021 at 11:23:19AM -0400, Tom Lane wrote:
>> Of course, that just reduces the memory consumption on the client
>> side; it does nothing for the locks. Can we get away with releasing the
>> lock immediately after doing an ALTER OWNER or GRANT/REVOKE on a blob?
> Well, in pg_upgrade mode you can, since there are no other cluster
> users, but you might be asking for general pg_dump usage.
Yeah, this problem doesn't only affect pg_upgrade scenarios, so it'd
really be better to find a way that isn't dependent on binary-upgrade
mode.
regards, tom lane
^ permalink raw reply [nested|flat] 43+ messages in thread
* Re: pg_upgrade failing for 200+ million Large Objects
2021-03-08 11:02 Re: pg_upgrade failing for 200+ million Large Objects Tharakan, Robins <[email protected]>
2021-03-08 12:33 ` Re: pg_upgrade failing for 200+ million Large Objects Magnus Hagander <[email protected]>
2021-03-08 14:13 ` Re: pg_upgrade failing for 200+ million Large Objects Robins Tharakan <[email protected]>
2021-03-08 16:33 ` Re: pg_upgrade failing for 200+ million Large Objects Tom Lane <[email protected]>
2021-03-08 16:35 ` Re: pg_upgrade failing for 200+ million Large Objects Magnus Hagander <[email protected]>
2021-03-08 16:58 ` Re: pg_upgrade failing for 200+ million Large Objects Tom Lane <[email protected]>
2021-03-20 04:39 ` Re: pg_upgrade failing for 200+ million Large Objects Jan Wieck <[email protected]>
2021-03-20 15:23 ` Re: pg_upgrade failing for 200+ million Large Objects Tom Lane <[email protected]>
@ 2021-03-20 16:55 ` Jan Wieck <[email protected]>
2021-03-21 11:47 ` Re: pg_upgrade failing for 200+ million Large Objects Andrew Dunstan <[email protected]>
1 sibling, 1 reply; 43+ messages in thread
From: Jan Wieck @ 2021-03-20 16:55 UTC (permalink / raw)
To: Tom Lane <[email protected]>; +Cc: Magnus Hagander <[email protected]>; Robins Tharakan <[email protected]>; Peter Eisentraut <[email protected]>; pgsql-hackers
On 3/20/21 11:23 AM, Tom Lane wrote:
> Jan Wieck <[email protected]> writes:
>> All that aside, the entire approach doesn't scale.
>
> Yeah, agreed. When we gave large objects individual ownership and ACL
> info, it was argued that pg_dump could afford to treat each one as a
> separate TOC entry because "you wouldn't have that many of them, if
> they're large". The limits of that approach were obvious even at the
> time, and I think now we're starting to see people for whom it really
> doesn't work.
It actually looks more like some users have millions of "small objects".
I am still wondering where that is coming from and why they are abusing
LOs in that way, but that is more out of curiosity. Fact is that they
are out there and that they cannot upgrade from their 9.5 databases,
which are now past EOL.
>
> I wonder if pg_dump could improve matters cheaply by aggregating the
> large objects by owner and ACL contents. That is, do
>
> select distinct lomowner, lomacl from pg_largeobject_metadata;
>
> and make just *one* BLOB TOC entry for each result. Then dump out
> all the matching blobs under that heading.
What I am currently experimenting with is moving the BLOB TOC entries
into the parallel data phase of pg_restore "when doing binary upgrade".
It seems to scale nicely with the number of cores in the system. In
addition to that have options for pg_upgrade and pg_restore that cause
the restore to batch them into transactions, like 10,000 objects at a
time. There was a separate thread for that but I guess it is better to
keep it all together here now.
>
> A possible objection is that it'd reduce the ability to restore blobs
> selectively, so maybe we'd need to make it optional.
I fully intend to make all this into new "options". I am afraid that
there is no one-size-fits-all solution here.
>
> Of course, that just reduces the memory consumption on the client
> side; it does nothing for the locks. Can we get away with releasing the
> lock immediately after doing an ALTER OWNER or GRANT/REVOKE on a blob?
I'm not very fond of the idea going lockless when at the same time
trying to parallelize the restore phase. That can lead to really nasty
race conditions. For now I'm aiming at batches in transactions.
Regards, Jan
--
Jan Wieck
Principle Database Engineer
Amazon Web Services
^ permalink raw reply [nested|flat] 43+ messages in thread
* Re: pg_upgrade failing for 200+ million Large Objects
2021-03-08 11:02 Re: pg_upgrade failing for 200+ million Large Objects Tharakan, Robins <[email protected]>
2021-03-08 12:33 ` Re: pg_upgrade failing for 200+ million Large Objects Magnus Hagander <[email protected]>
2021-03-08 14:13 ` Re: pg_upgrade failing for 200+ million Large Objects Robins Tharakan <[email protected]>
2021-03-08 16:33 ` Re: pg_upgrade failing for 200+ million Large Objects Tom Lane <[email protected]>
2021-03-08 16:35 ` Re: pg_upgrade failing for 200+ million Large Objects Magnus Hagander <[email protected]>
2021-03-08 16:58 ` Re: pg_upgrade failing for 200+ million Large Objects Tom Lane <[email protected]>
2021-03-20 04:39 ` Re: pg_upgrade failing for 200+ million Large Objects Jan Wieck <[email protected]>
2021-03-20 15:23 ` Re: pg_upgrade failing for 200+ million Large Objects Tom Lane <[email protected]>
2021-03-20 16:55 ` Re: pg_upgrade failing for 200+ million Large Objects Jan Wieck <[email protected]>
@ 2021-03-21 11:47 ` Andrew Dunstan <[email protected]>
2021-03-21 16:56 ` Re: pg_upgrade failing for 200+ million Large Objects Jan Wieck <[email protected]>
0 siblings, 1 reply; 43+ messages in thread
From: Andrew Dunstan @ 2021-03-21 11:47 UTC (permalink / raw)
To: Jan Wieck <[email protected]>; Tom Lane <[email protected]>; +Cc: Magnus Hagander <[email protected]>; Robins Tharakan <[email protected]>; Peter Eisentraut <[email protected]>; pgsql-hackers
On 3/20/21 12:55 PM, Jan Wieck wrote:
> On 3/20/21 11:23 AM, Tom Lane wrote:
>> Jan Wieck <[email protected]> writes:
>>> All that aside, the entire approach doesn't scale.
>>
>> Yeah, agreed. When we gave large objects individual ownership and ACL
>> info, it was argued that pg_dump could afford to treat each one as a
>> separate TOC entry because "you wouldn't have that many of them, if
>> they're large". The limits of that approach were obvious even at the
>> time, and I think now we're starting to see people for whom it really
>> doesn't work.
>
> It actually looks more like some users have millions of "small
> objects". I am still wondering where that is coming from and why they
> are abusing LOs in that way, but that is more out of curiosity. Fact
> is that they are out there and that they cannot upgrade from their 9.5
> databases, which are now past EOL.
>
One possible (probable?) source is the JDBC driver, which currently
treats all Blobs (and Clobs, for that matter) as LOs. I'm working on
improving that some: <https://github.com/pgjdbc/pgjdbc/pull/2093;
cheers
andrew
--
Andrew Dunstan
EDB: https://www.enterprisedb.com
^ permalink raw reply [nested|flat] 43+ messages in thread
* Re: pg_upgrade failing for 200+ million Large Objects
2021-03-08 11:02 Re: pg_upgrade failing for 200+ million Large Objects Tharakan, Robins <[email protected]>
2021-03-08 12:33 ` Re: pg_upgrade failing for 200+ million Large Objects Magnus Hagander <[email protected]>
2021-03-08 14:13 ` Re: pg_upgrade failing for 200+ million Large Objects Robins Tharakan <[email protected]>
2021-03-08 16:33 ` Re: pg_upgrade failing for 200+ million Large Objects Tom Lane <[email protected]>
2021-03-08 16:35 ` Re: pg_upgrade failing for 200+ million Large Objects Magnus Hagander <[email protected]>
2021-03-08 16:58 ` Re: pg_upgrade failing for 200+ million Large Objects Tom Lane <[email protected]>
2021-03-20 04:39 ` Re: pg_upgrade failing for 200+ million Large Objects Jan Wieck <[email protected]>
2021-03-20 15:23 ` Re: pg_upgrade failing for 200+ million Large Objects Tom Lane <[email protected]>
2021-03-20 16:55 ` Re: pg_upgrade failing for 200+ million Large Objects Jan Wieck <[email protected]>
2021-03-21 11:47 ` Re: pg_upgrade failing for 200+ million Large Objects Andrew Dunstan <[email protected]>
@ 2021-03-21 16:56 ` Jan Wieck <[email protected]>
2021-03-21 18:18 ` Re: pg_upgrade failing for 200+ million Large Objects Andrew Dunstan <[email protected]>
0 siblings, 1 reply; 43+ messages in thread
From: Jan Wieck @ 2021-03-21 16:56 UTC (permalink / raw)
To: Andrew Dunstan <[email protected]>; Tom Lane <[email protected]>; +Cc: Magnus Hagander <[email protected]>; Robins Tharakan <[email protected]>; Peter Eisentraut <[email protected]>; pgsql-hackers
On 3/21/21 7:47 AM, Andrew Dunstan wrote:
> One possible (probable?) source is the JDBC driver, which currently
> treats all Blobs (and Clobs, for that matter) as LOs. I'm working on
> improving that some: <https://github.com/pgjdbc/pgjdbc/pull/2093;
You mean the user is using OID columns pointing to large objects and the
JDBC driver is mapping those for streaming operations?
Yeah, that would explain a lot.
Thanks, Jan
--
Jan Wieck
Principle Database Engineer
Amazon Web Services
^ permalink raw reply [nested|flat] 43+ messages in thread
* Re: pg_upgrade failing for 200+ million Large Objects
2021-03-08 11:02 Re: pg_upgrade failing for 200+ million Large Objects Tharakan, Robins <[email protected]>
2021-03-08 12:33 ` Re: pg_upgrade failing for 200+ million Large Objects Magnus Hagander <[email protected]>
2021-03-08 14:13 ` Re: pg_upgrade failing for 200+ million Large Objects Robins Tharakan <[email protected]>
2021-03-08 16:33 ` Re: pg_upgrade failing for 200+ million Large Objects Tom Lane <[email protected]>
2021-03-08 16:35 ` Re: pg_upgrade failing for 200+ million Large Objects Magnus Hagander <[email protected]>
2021-03-08 16:58 ` Re: pg_upgrade failing for 200+ million Large Objects Tom Lane <[email protected]>
2021-03-20 04:39 ` Re: pg_upgrade failing for 200+ million Large Objects Jan Wieck <[email protected]>
2021-03-20 15:23 ` Re: pg_upgrade failing for 200+ million Large Objects Tom Lane <[email protected]>
2021-03-20 16:55 ` Re: pg_upgrade failing for 200+ million Large Objects Jan Wieck <[email protected]>
2021-03-21 11:47 ` Re: pg_upgrade failing for 200+ million Large Objects Andrew Dunstan <[email protected]>
2021-03-21 16:56 ` Re: pg_upgrade failing for 200+ million Large Objects Jan Wieck <[email protected]>
@ 2021-03-21 18:18 ` Andrew Dunstan <[email protected]>
2021-03-22 21:36 ` Re: pg_upgrade failing for 200+ million Large Objects Zhihong Yu <[email protected]>
0 siblings, 1 reply; 43+ messages in thread
From: Andrew Dunstan @ 2021-03-21 18:18 UTC (permalink / raw)
To: Jan Wieck <[email protected]>; Tom Lane <[email protected]>; +Cc: Magnus Hagander <[email protected]>; Robins Tharakan <[email protected]>; Peter Eisentraut <[email protected]>; pgsql-hackers
On 3/21/21 12:56 PM, Jan Wieck wrote:
> On 3/21/21 7:47 AM, Andrew Dunstan wrote:
>> One possible (probable?) source is the JDBC driver, which currently
>> treats all Blobs (and Clobs, for that matter) as LOs. I'm working on
>> improving that some: <https://github.com/pgjdbc/pgjdbc/pull/2093;
>
> You mean the user is using OID columns pointing to large objects and
> the JDBC driver is mapping those for streaming operations?
>
> Yeah, that would explain a lot.
>
>
>
Probably in most cases the database is designed by Hibernate, and the
front end programmers know nothing at all of Oids or LOs, they just ask
for and get a Blob.
cheers
andrew
--
Andrew Dunstan
EDB: https://www.enterprisedb.com
^ permalink raw reply [nested|flat] 43+ messages in thread
* Re: pg_upgrade failing for 200+ million Large Objects
2021-03-08 11:02 Re: pg_upgrade failing for 200+ million Large Objects Tharakan, Robins <[email protected]>
2021-03-08 12:33 ` Re: pg_upgrade failing for 200+ million Large Objects Magnus Hagander <[email protected]>
2021-03-08 14:13 ` Re: pg_upgrade failing for 200+ million Large Objects Robins Tharakan <[email protected]>
2021-03-08 16:33 ` Re: pg_upgrade failing for 200+ million Large Objects Tom Lane <[email protected]>
2021-03-08 16:35 ` Re: pg_upgrade failing for 200+ million Large Objects Magnus Hagander <[email protected]>
2021-03-08 16:58 ` Re: pg_upgrade failing for 200+ million Large Objects Tom Lane <[email protected]>
2021-03-20 04:39 ` Re: pg_upgrade failing for 200+ million Large Objects Jan Wieck <[email protected]>
2021-03-20 15:23 ` Re: pg_upgrade failing for 200+ million Large Objects Tom Lane <[email protected]>
2021-03-20 16:55 ` Re: pg_upgrade failing for 200+ million Large Objects Jan Wieck <[email protected]>
2021-03-21 11:47 ` Re: pg_upgrade failing for 200+ million Large Objects Andrew Dunstan <[email protected]>
2021-03-21 16:56 ` Re: pg_upgrade failing for 200+ million Large Objects Jan Wieck <[email protected]>
2021-03-21 18:18 ` Re: pg_upgrade failing for 200+ million Large Objects Andrew Dunstan <[email protected]>
@ 2021-03-22 21:36 ` Zhihong Yu <[email protected]>
2021-03-22 23:18 ` Re: pg_upgrade failing for 200+ million Large Objects Jan Wieck <[email protected]>
0 siblings, 1 reply; 43+ messages in thread
From: Zhihong Yu @ 2021-03-22 21:36 UTC (permalink / raw)
To: Andrew Dunstan <[email protected]>; +Cc: Jan Wieck <[email protected]>; Tom Lane <[email protected]>; Magnus Hagander <[email protected]>; Robins Tharakan <[email protected]>; Peter Eisentraut <[email protected]>; pgsql-hackers
>
> Hi,
>
w.r.t. pg_upgrade_improvements.v2.diff.
+ blobBatchCount = 0;
+ blobInXact = false;
The count and bool flag are always reset in tandem. It seems
variable blobInXact is not needed.
Cheers
^ permalink raw reply [nested|flat] 43+ messages in thread
* Re: pg_upgrade failing for 200+ million Large Objects
2021-03-08 11:02 Re: pg_upgrade failing for 200+ million Large Objects Tharakan, Robins <[email protected]>
2021-03-08 12:33 ` Re: pg_upgrade failing for 200+ million Large Objects Magnus Hagander <[email protected]>
2021-03-08 14:13 ` Re: pg_upgrade failing for 200+ million Large Objects Robins Tharakan <[email protected]>
2021-03-08 16:33 ` Re: pg_upgrade failing for 200+ million Large Objects Tom Lane <[email protected]>
2021-03-08 16:35 ` Re: pg_upgrade failing for 200+ million Large Objects Magnus Hagander <[email protected]>
2021-03-08 16:58 ` Re: pg_upgrade failing for 200+ million Large Objects Tom Lane <[email protected]>
2021-03-20 04:39 ` Re: pg_upgrade failing for 200+ million Large Objects Jan Wieck <[email protected]>
2021-03-20 15:23 ` Re: pg_upgrade failing for 200+ million Large Objects Tom Lane <[email protected]>
2021-03-20 16:55 ` Re: pg_upgrade failing for 200+ million Large Objects Jan Wieck <[email protected]>
2021-03-21 11:47 ` Re: pg_upgrade failing for 200+ million Large Objects Andrew Dunstan <[email protected]>
2021-03-21 16:56 ` Re: pg_upgrade failing for 200+ million Large Objects Jan Wieck <[email protected]>
2021-03-21 18:18 ` Re: pg_upgrade failing for 200+ million Large Objects Andrew Dunstan <[email protected]>
2021-03-22 21:36 ` Re: pg_upgrade failing for 200+ million Large Objects Zhihong Yu <[email protected]>
@ 2021-03-22 23:18 ` Jan Wieck <[email protected]>
2021-03-23 12:51 ` Re: pg_upgrade failing for 200+ million Large Objects Jan Wieck <[email protected]>
0 siblings, 1 reply; 43+ messages in thread
From: Jan Wieck @ 2021-03-22 23:18 UTC (permalink / raw)
To: Zhihong Yu <[email protected]>; Andrew Dunstan <[email protected]>; +Cc: Tom Lane <[email protected]>; Magnus Hagander <[email protected]>; Robins Tharakan <[email protected]>; Peter Eisentraut <[email protected]>; pgsql-hackers
On 3/22/21 5:36 PM, Zhihong Yu wrote:
> Hi,
>
> w.r.t. pg_upgrade_improvements.v2.diff.
>
> + blobBatchCount = 0;
> + blobInXact = false;
>
> The count and bool flag are always reset in tandem. It seems
> variable blobInXact is not needed.
You are right. I will fix that.
Thanks, Jan
--
Jan Wieck
Principle Database Engineer
Amazon Web Services
^ permalink raw reply [nested|flat] 43+ messages in thread
* Re: pg_upgrade failing for 200+ million Large Objects
2021-03-08 11:02 Re: pg_upgrade failing for 200+ million Large Objects Tharakan, Robins <[email protected]>
2021-03-08 12:33 ` Re: pg_upgrade failing for 200+ million Large Objects Magnus Hagander <[email protected]>
2021-03-08 14:13 ` Re: pg_upgrade failing for 200+ million Large Objects Robins Tharakan <[email protected]>
2021-03-08 16:33 ` Re: pg_upgrade failing for 200+ million Large Objects Tom Lane <[email protected]>
2021-03-08 16:35 ` Re: pg_upgrade failing for 200+ million Large Objects Magnus Hagander <[email protected]>
2021-03-08 16:58 ` Re: pg_upgrade failing for 200+ million Large Objects Tom Lane <[email protected]>
2021-03-20 04:39 ` Re: pg_upgrade failing for 200+ million Large Objects Jan Wieck <[email protected]>
2021-03-20 15:23 ` Re: pg_upgrade failing for 200+ million Large Objects Tom Lane <[email protected]>
2021-03-20 16:55 ` Re: pg_upgrade failing for 200+ million Large Objects Jan Wieck <[email protected]>
2021-03-21 11:47 ` Re: pg_upgrade failing for 200+ million Large Objects Andrew Dunstan <[email protected]>
2021-03-21 16:56 ` Re: pg_upgrade failing for 200+ million Large Objects Jan Wieck <[email protected]>
2021-03-21 18:18 ` Re: pg_upgrade failing for 200+ million Large Objects Andrew Dunstan <[email protected]>
2021-03-22 21:36 ` Re: pg_upgrade failing for 200+ million Large Objects Zhihong Yu <[email protected]>
2021-03-22 23:18 ` Re: pg_upgrade failing for 200+ million Large Objects Jan Wieck <[email protected]>
@ 2021-03-23 12:51 ` Jan Wieck <[email protected]>
2021-03-23 14:56 ` Re: pg_upgrade failing for 200+ million Large Objects Bruce Momjian <[email protected]>
0 siblings, 1 reply; 43+ messages in thread
From: Jan Wieck @ 2021-03-23 12:51 UTC (permalink / raw)
To: Zhihong Yu <[email protected]>; Andrew Dunstan <[email protected]>; +Cc: Tom Lane <[email protected]>; Magnus Hagander <[email protected]>; Robins Tharakan <[email protected]>; Peter Eisentraut <[email protected]>; pgsql-hackers
On 3/22/21 7:18 PM, Jan Wieck wrote:
> On 3/22/21 5:36 PM, Zhihong Yu wrote:
>> Hi,
>>
>> w.r.t. pg_upgrade_improvements.v2.diff.
>>
>> + blobBatchCount = 0;
>> + blobInXact = false;
>>
>> The count and bool flag are always reset in tandem. It seems
>> variable blobInXact is not needed.
>
> You are right. I will fix that.
New patch v3 attached.
Thanks, Jan
--
Jan Wieck
Principle Database Engineer
Amazon Web Services
Attachments:
[text/x-patch] pg_upgrade_improvements.v3.diff (22.8K, 2-pg_upgrade_improvements.v3.diff)
download | inline diff:
diff --git a/src/bin/pg_dump/parallel.c b/src/bin/pg_dump/parallel.c
index c7351a4..4a611d0 100644
--- a/src/bin/pg_dump/parallel.c
+++ b/src/bin/pg_dump/parallel.c
@@ -864,6 +864,11 @@ RunWorker(ArchiveHandle *AH, ParallelSlot *slot)
WaitForCommands(AH, pipefd);
/*
+ * Close an eventually open BLOB batch transaction.
+ */
+ CommitBlobTransaction((Archive *)AH);
+
+ /*
* Disconnect from database and clean up.
*/
set_cancel_slot_archive(slot, NULL);
diff --git a/src/bin/pg_dump/pg_backup.h b/src/bin/pg_dump/pg_backup.h
index 0296b9b..cd8a590 100644
--- a/src/bin/pg_dump/pg_backup.h
+++ b/src/bin/pg_dump/pg_backup.h
@@ -203,6 +203,8 @@ typedef struct Archive
int numWorkers; /* number of parallel processes */
char *sync_snapshot_id; /* sync snapshot id for parallel operation */
+ int blobBatchSize; /* # of blobs to restore per transaction */
+
/* info needed for string escaping */
int encoding; /* libpq code for client_encoding */
bool std_strings; /* standard_conforming_strings */
@@ -269,6 +271,7 @@ extern void WriteData(Archive *AH, const void *data, size_t dLen);
extern int StartBlob(Archive *AH, Oid oid);
extern int EndBlob(Archive *AH, Oid oid);
+extern void CommitBlobTransaction(Archive *AH);
extern void CloseArchive(Archive *AH);
extern void SetArchiveOptions(Archive *AH, DumpOptions *dopt, RestoreOptions *ropt);
diff --git a/src/bin/pg_dump/pg_backup_archiver.c b/src/bin/pg_dump/pg_backup_archiver.c
index 1f82c64..8331e8a 100644
--- a/src/bin/pg_dump/pg_backup_archiver.c
+++ b/src/bin/pg_dump/pg_backup_archiver.c
@@ -68,6 +68,7 @@ typedef struct _parallelReadyList
bool sorted; /* are valid entries currently sorted? */
} ParallelReadyList;
+static int blobBatchCount = 0;
static ArchiveHandle *_allocAH(const char *FileSpec, const ArchiveFormat fmt,
const int compression, bool dosync, ArchiveMode mode,
@@ -265,6 +266,8 @@ CloseArchive(Archive *AHX)
int res = 0;
ArchiveHandle *AH = (ArchiveHandle *) AHX;
+ CommitBlobTransaction(AHX);
+
AH->ClosePtr(AH);
/* Close the output */
@@ -279,6 +282,23 @@ CloseArchive(Archive *AHX)
/* Public */
void
+CommitBlobTransaction(Archive *AHX)
+{
+ ArchiveHandle *AH = (ArchiveHandle *) AHX;
+
+ if (blobBatchCount > 0)
+ {
+ ahprintf(AH, "--\n");
+ ahprintf(AH, "-- End BLOB restore batch\n");
+ ahprintf(AH, "--\n");
+ ahprintf(AH, "COMMIT;\n\n");
+
+ blobBatchCount = 0;
+ }
+}
+
+/* Public */
+void
SetArchiveOptions(Archive *AH, DumpOptions *dopt, RestoreOptions *ropt)
{
/* Caller can omit dump options, in which case we synthesize them */
@@ -3531,6 +3551,57 @@ _printTocEntry(ArchiveHandle *AH, TocEntry *te, bool isData)
{
RestoreOptions *ropt = AH->public.ropt;
+ /* We restore BLOBs in batches to reduce XID consumption */
+ if (strcmp(te->desc, "BLOB") == 0 && AH->public.blobBatchSize > 0)
+ {
+ if (blobBatchCount > 0)
+ {
+ /* We are inside a BLOB restore transaction */
+ if (blobBatchCount >= AH->public.blobBatchSize)
+ {
+ /*
+ * We did reach the batch size with the previous BLOB.
+ * Commit and start a new batch.
+ */
+ ahprintf(AH, "--\n");
+ ahprintf(AH, "-- BLOB batch size reached\n");
+ ahprintf(AH, "--\n");
+ ahprintf(AH, "COMMIT;\n");
+ ahprintf(AH, "BEGIN;\n\n");
+
+ blobBatchCount = 1;
+ }
+ else
+ {
+ /* This one still fits into the current batch */
+ blobBatchCount++;
+ }
+ }
+ else
+ {
+ /* Not inside a transaction, start a new batch */
+ ahprintf(AH, "--\n");
+ ahprintf(AH, "-- Start BLOB restore batch\n");
+ ahprintf(AH, "--\n");
+ ahprintf(AH, "BEGIN;\n\n");
+
+ blobBatchCount = 1;
+ }
+ }
+ else
+ {
+ /* Not a BLOB. If we have a BLOB batch open, close it. */
+ if (blobBatchCount > 0)
+ {
+ ahprintf(AH, "--\n");
+ ahprintf(AH, "-- End BLOB restore batch\n");
+ ahprintf(AH, "--\n");
+ ahprintf(AH, "COMMIT;\n\n");
+
+ blobBatchCount = 0;
+ }
+ }
+
/* Select owner, schema, tablespace and default AM as necessary */
_becomeOwner(AH, te);
_selectOutputSchema(AH, te->namespace);
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index f8bec3f..f153f08 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -165,12 +165,20 @@ static void guessConstraintInheritance(TableInfo *tblinfo, int numTables);
static void dumpComment(Archive *fout, const char *type, const char *name,
const char *namespace, const char *owner,
CatalogId catalogId, int subid, DumpId dumpId);
+static bool dumpCommentQuery(Archive *fout, PQExpBuffer query, PQExpBuffer tag,
+ const char *type, const char *name,
+ const char *namespace, const char *owner,
+ CatalogId catalogId, int subid, DumpId dumpId);
static int findComments(Archive *fout, Oid classoid, Oid objoid,
CommentItem **items);
static int collectComments(Archive *fout, CommentItem **items);
static void dumpSecLabel(Archive *fout, const char *type, const char *name,
const char *namespace, const char *owner,
CatalogId catalogId, int subid, DumpId dumpId);
+static bool dumpSecLabelQuery(Archive *fout, PQExpBuffer query, PQExpBuffer tag,
+ const char *type, const char *name,
+ const char *namespace, const char *owner,
+ CatalogId catalogId, int subid, DumpId dumpId);
static int findSecLabels(Archive *fout, Oid classoid, Oid objoid,
SecLabelItem **items);
static int collectSecLabels(Archive *fout, SecLabelItem **items);
@@ -227,6 +235,13 @@ static DumpId dumpACL(Archive *fout, DumpId objDumpId, DumpId altDumpId,
const char *nspname, const char *owner,
const char *acls, const char *racls,
const char *initacls, const char *initracls);
+static bool dumpACLQuery(Archive *fout, PQExpBuffer query, PQExpBuffer tag,
+ DumpId objDumpId, DumpId altDumpId,
+ const char *type, const char *name,
+ const char *subname,
+ const char *nspname, const char *owner,
+ const char *acls, const char *racls,
+ const char *initacls, const char *initracls);
static void getDependencies(Archive *fout);
static void BuildArchiveDependencies(Archive *fout);
@@ -3468,11 +3483,44 @@ dumpBlob(Archive *fout, const BlobInfo *binfo)
{
PQExpBuffer cquery = createPQExpBuffer();
PQExpBuffer dquery = createPQExpBuffer();
+ PQExpBuffer tag = createPQExpBuffer();
+ teSection section = SECTION_PRE_DATA;
appendPQExpBuffer(cquery,
"SELECT pg_catalog.lo_create('%s');\n",
binfo->dobj.name);
+ /*
+ * In binary upgrade mode we put all the queries to restore
+ * one large object into a single TOC entry and emit it as
+ * SECTION_DATA so that they can be restored in parallel.
+ */
+ if (fout->dopt->binary_upgrade)
+ {
+ section = SECTION_DATA;
+
+ /* Dump comment if any */
+ if (binfo->dobj.dump & DUMP_COMPONENT_COMMENT)
+ dumpCommentQuery(fout, cquery, tag, "LARGE OBJECT",
+ binfo->dobj.name, NULL, binfo->rolname,
+ binfo->dobj.catId, 0, binfo->dobj.dumpId);
+
+ /* Dump security label if any */
+ if (binfo->dobj.dump & DUMP_COMPONENT_SECLABEL)
+ dumpSecLabelQuery(fout, cquery, tag, "LARGE OBJECT",
+ binfo->dobj.name,
+ NULL, binfo->rolname,
+ binfo->dobj.catId, 0, binfo->dobj.dumpId);
+
+ /* Dump ACL if any */
+ if (binfo->blobacl && (binfo->dobj.dump & DUMP_COMPONENT_ACL))
+ dumpACLQuery(fout, cquery, tag,
+ binfo->dobj.dumpId, InvalidDumpId, "LARGE OBJECT",
+ binfo->dobj.name, NULL,
+ NULL, binfo->rolname, binfo->blobacl, binfo->rblobacl,
+ binfo->initblobacl, binfo->initrblobacl);
+ }
+
appendPQExpBuffer(dquery,
"SELECT pg_catalog.lo_unlink('%s');\n",
binfo->dobj.name);
@@ -3482,28 +3530,31 @@ dumpBlob(Archive *fout, const BlobInfo *binfo)
ARCHIVE_OPTS(.tag = binfo->dobj.name,
.owner = binfo->rolname,
.description = "BLOB",
- .section = SECTION_PRE_DATA,
+ .section = section,
.createStmt = cquery->data,
.dropStmt = dquery->data));
- /* Dump comment if any */
- if (binfo->dobj.dump & DUMP_COMPONENT_COMMENT)
- dumpComment(fout, "LARGE OBJECT", binfo->dobj.name,
- NULL, binfo->rolname,
- binfo->dobj.catId, 0, binfo->dobj.dumpId);
-
- /* Dump security label if any */
- if (binfo->dobj.dump & DUMP_COMPONENT_SECLABEL)
- dumpSecLabel(fout, "LARGE OBJECT", binfo->dobj.name,
- NULL, binfo->rolname,
- binfo->dobj.catId, 0, binfo->dobj.dumpId);
-
- /* Dump ACL if any */
- if (binfo->blobacl && (binfo->dobj.dump & DUMP_COMPONENT_ACL))
- dumpACL(fout, binfo->dobj.dumpId, InvalidDumpId, "LARGE OBJECT",
- binfo->dobj.name, NULL,
- NULL, binfo->rolname, binfo->blobacl, binfo->rblobacl,
- binfo->initblobacl, binfo->initrblobacl);
+ if (!fout->dopt->binary_upgrade)
+ {
+ /* Dump comment if any */
+ if (binfo->dobj.dump & DUMP_COMPONENT_COMMENT)
+ dumpComment(fout, "LARGE OBJECT", binfo->dobj.name,
+ NULL, binfo->rolname,
+ binfo->dobj.catId, 0, binfo->dobj.dumpId);
+
+ /* Dump security label if any */
+ if (binfo->dobj.dump & DUMP_COMPONENT_SECLABEL)
+ dumpSecLabel(fout, "LARGE OBJECT", binfo->dobj.name,
+ NULL, binfo->rolname,
+ binfo->dobj.catId, 0, binfo->dobj.dumpId);
+
+ /* Dump ACL if any */
+ if (binfo->blobacl && (binfo->dobj.dump & DUMP_COMPONENT_ACL))
+ dumpACL(fout, binfo->dobj.dumpId, InvalidDumpId, "LARGE OBJECT",
+ binfo->dobj.name, NULL,
+ NULL, binfo->rolname, binfo->blobacl, binfo->rblobacl,
+ binfo->initblobacl, binfo->initrblobacl);
+ }
destroyPQExpBuffer(cquery);
destroyPQExpBuffer(dquery);
@@ -9868,25 +9919,56 @@ dumpComment(Archive *fout, const char *type, const char *name,
const char *namespace, const char *owner,
CatalogId catalogId, int subid, DumpId dumpId)
{
+ PQExpBuffer query = createPQExpBuffer();
+ PQExpBuffer tag = createPQExpBuffer();
+
+ if (dumpCommentQuery(fout, query, tag, type, name, namespace, owner,
+ catalogId, subid, dumpId))
+ {
+ /*
+ * We mark comments as SECTION_NONE because they really belong in the
+ * same section as their parent, whether that is pre-data or
+ * post-data.
+ */
+ ArchiveEntry(fout, nilCatalogId, createDumpId(),
+ ARCHIVE_OPTS(.tag = tag->data,
+ .namespace = namespace,
+ .owner = owner,
+ .description = "COMMENT",
+ .section = SECTION_NONE,
+ .createStmt = query->data,
+ .deps = &dumpId,
+ .nDeps = 1));
+ }
+ destroyPQExpBuffer(query);
+ destroyPQExpBuffer(tag);
+}
+
+static bool
+dumpCommentQuery(Archive *fout, PQExpBuffer query, PQExpBuffer tag,
+ const char *type, const char *name,
+ const char *namespace, const char *owner,
+ CatalogId catalogId, int subid, DumpId dumpId)
+{
DumpOptions *dopt = fout->dopt;
CommentItem *comments;
int ncomments;
/* do nothing, if --no-comments is supplied */
if (dopt->no_comments)
- return;
+ return false;
/* Comments are schema not data ... except blob comments are data */
if (strcmp(type, "LARGE OBJECT") != 0)
{
if (dopt->dataOnly)
- return;
+ return false;
}
else
{
/* We do dump blob comments in binary-upgrade mode */
if (dopt->schemaOnly && !dopt->binary_upgrade)
- return;
+ return false;
}
/* Search for comments associated with catalogId, using table */
@@ -9905,9 +9987,6 @@ dumpComment(Archive *fout, const char *type, const char *name,
/* If a comment exists, build COMMENT ON statement */
if (ncomments > 0)
{
- PQExpBuffer query = createPQExpBuffer();
- PQExpBuffer tag = createPQExpBuffer();
-
appendPQExpBuffer(query, "COMMENT ON %s ", type);
if (namespace && *namespace)
appendPQExpBuffer(query, "%s.", fmtId(namespace));
@@ -9917,24 +9996,10 @@ dumpComment(Archive *fout, const char *type, const char *name,
appendPQExpBuffer(tag, "%s %s", type, name);
- /*
- * We mark comments as SECTION_NONE because they really belong in the
- * same section as their parent, whether that is pre-data or
- * post-data.
- */
- ArchiveEntry(fout, nilCatalogId, createDumpId(),
- ARCHIVE_OPTS(.tag = tag->data,
- .namespace = namespace,
- .owner = owner,
- .description = "COMMENT",
- .section = SECTION_NONE,
- .createStmt = query->data,
- .deps = &dumpId,
- .nDeps = 1));
-
- destroyPQExpBuffer(query);
- destroyPQExpBuffer(tag);
+ return true;
}
+
+ return false;
}
/*
@@ -15070,18 +15135,63 @@ dumpACL(Archive *fout, DumpId objDumpId, DumpId altDumpId,
const char *initacls, const char *initracls)
{
DumpId aclDumpId = InvalidDumpId;
+ PQExpBuffer query = createPQExpBuffer();
+ PQExpBuffer tag = createPQExpBuffer();
+
+ if (dumpACLQuery(fout, query, tag, objDumpId, altDumpId,
+ type, name, subname, nspname, owner,
+ acls, racls, initacls, initracls))
+ {
+ DumpId aclDeps[2];
+ int nDeps = 0;
+
+ if (subname)
+ appendPQExpBuffer(tag, "COLUMN %s.%s", name, subname);
+ else
+ appendPQExpBuffer(tag, "%s %s", type, name);
+
+ aclDeps[nDeps++] = objDumpId;
+ if (altDumpId != InvalidDumpId)
+ aclDeps[nDeps++] = altDumpId;
+
+ aclDumpId = createDumpId();
+
+ ArchiveEntry(fout, nilCatalogId, aclDumpId,
+ ARCHIVE_OPTS(.tag = tag->data,
+ .namespace = nspname,
+ .owner = owner,
+ .description = "ACL",
+ .section = SECTION_NONE,
+ .createStmt = query->data,
+ .deps = aclDeps,
+ .nDeps = nDeps));
+
+ }
+
+ destroyPQExpBuffer(query);
+ destroyPQExpBuffer(tag);
+
+ return aclDumpId;
+}
+
+static bool
+dumpACLQuery(Archive *fout, PQExpBuffer query, PQExpBuffer tag,
+ DumpId objDumpId, DumpId altDumpId,
+ const char *type, const char *name, const char *subname,
+ const char *nspname, const char *owner,
+ const char *acls, const char *racls,
+ const char *initacls, const char *initracls)
+{
DumpOptions *dopt = fout->dopt;
- PQExpBuffer sql;
+ bool haveACL = false;
/* Do nothing if ACL dump is not enabled */
if (dopt->aclsSkip)
- return InvalidDumpId;
+ return false;
/* --data-only skips ACLs *except* BLOB ACLs */
if (dopt->dataOnly && strcmp(type, "LARGE OBJECT") != 0)
- return InvalidDumpId;
-
- sql = createPQExpBuffer();
+ return false;
/*
* Check to see if this object has had any initial ACLs included for it.
@@ -15093,54 +15203,31 @@ dumpACL(Archive *fout, DumpId objDumpId, DumpId altDumpId,
*/
if (strlen(initacls) != 0 || strlen(initracls) != 0)
{
- appendPQExpBufferStr(sql, "SELECT pg_catalog.binary_upgrade_set_record_init_privs(true);\n");
+ haveACL = true;
+ appendPQExpBufferStr(query, "SELECT pg_catalog.binary_upgrade_set_record_init_privs(true);\n");
if (!buildACLCommands(name, subname, nspname, type,
initacls, initracls, owner,
- "", fout->remoteVersion, sql))
+ "", fout->remoteVersion, query))
fatal("could not parse initial GRANT ACL list (%s) or initial REVOKE ACL list (%s) for object \"%s\" (%s)",
initacls, initracls, name, type);
- appendPQExpBufferStr(sql, "SELECT pg_catalog.binary_upgrade_set_record_init_privs(false);\n");
+ appendPQExpBufferStr(query, "SELECT pg_catalog.binary_upgrade_set_record_init_privs(false);\n");
}
if (!buildACLCommands(name, subname, nspname, type,
acls, racls, owner,
- "", fout->remoteVersion, sql))
+ "", fout->remoteVersion, query))
fatal("could not parse GRANT ACL list (%s) or REVOKE ACL list (%s) for object \"%s\" (%s)",
acls, racls, name, type);
- if (sql->len > 0)
+ if (haveACL && tag != NULL)
{
- PQExpBuffer tag = createPQExpBuffer();
- DumpId aclDeps[2];
- int nDeps = 0;
-
if (subname)
appendPQExpBuffer(tag, "COLUMN %s.%s", name, subname);
else
appendPQExpBuffer(tag, "%s %s", type, name);
-
- aclDeps[nDeps++] = objDumpId;
- if (altDumpId != InvalidDumpId)
- aclDeps[nDeps++] = altDumpId;
-
- aclDumpId = createDumpId();
-
- ArchiveEntry(fout, nilCatalogId, aclDumpId,
- ARCHIVE_OPTS(.tag = tag->data,
- .namespace = nspname,
- .owner = owner,
- .description = "ACL",
- .section = SECTION_NONE,
- .createStmt = sql->data,
- .deps = aclDeps,
- .nDeps = nDeps));
-
- destroyPQExpBuffer(tag);
}
- destroyPQExpBuffer(sql);
-
- return aclDumpId;
+ return haveACL;
}
/*
@@ -15166,34 +15253,58 @@ dumpSecLabel(Archive *fout, const char *type, const char *name,
const char *namespace, const char *owner,
CatalogId catalogId, int subid, DumpId dumpId)
{
+ PQExpBuffer query = createPQExpBuffer();
+ PQExpBuffer tag = createPQExpBuffer();
+
+ if (dumpSecLabelQuery(fout, query, tag, type, name,
+ namespace, owner, catalogId, subid, dumpId))
+ {
+ ArchiveEntry(fout, nilCatalogId, createDumpId(),
+ ARCHIVE_OPTS(.tag = tag->data,
+ .namespace = namespace,
+ .owner = owner,
+ .description = "SECURITY LABEL",
+ .section = SECTION_NONE,
+ .createStmt = query->data,
+ .deps = &dumpId,
+ .nDeps = 1));
+ }
+
+ destroyPQExpBuffer(query);
+ destroyPQExpBuffer(tag);
+}
+
+static bool
+dumpSecLabelQuery(Archive *fout, PQExpBuffer query, PQExpBuffer tag,
+ const char *type, const char *name,
+ const char *namespace, const char *owner,
+ CatalogId catalogId, int subid, DumpId dumpId)
+{
DumpOptions *dopt = fout->dopt;
SecLabelItem *labels;
int nlabels;
int i;
- PQExpBuffer query;
/* do nothing, if --no-security-labels is supplied */
if (dopt->no_security_labels)
- return;
+ return false;
/* Security labels are schema not data ... except blob labels are data */
if (strcmp(type, "LARGE OBJECT") != 0)
{
if (dopt->dataOnly)
- return;
+ return false;
}
else
{
/* We do dump blob security labels in binary-upgrade mode */
if (dopt->schemaOnly && !dopt->binary_upgrade)
- return;
+ return false;
}
/* Search for security labels associated with catalogId, using table */
nlabels = findSecLabels(fout, catalogId.tableoid, catalogId.oid, &labels);
- query = createPQExpBuffer();
-
for (i = 0; i < nlabels; i++)
{
/*
@@ -15214,22 +15325,11 @@ dumpSecLabel(Archive *fout, const char *type, const char *name,
if (query->len > 0)
{
- PQExpBuffer tag = createPQExpBuffer();
-
appendPQExpBuffer(tag, "%s %s", type, name);
- ArchiveEntry(fout, nilCatalogId, createDumpId(),
- ARCHIVE_OPTS(.tag = tag->data,
- .namespace = namespace,
- .owner = owner,
- .description = "SECURITY LABEL",
- .section = SECTION_NONE,
- .createStmt = query->data,
- .deps = &dumpId,
- .nDeps = 1));
- destroyPQExpBuffer(tag);
+ return true;
}
- destroyPQExpBuffer(query);
+ return false;
}
/*
diff --git a/src/bin/pg_dump/pg_restore.c b/src/bin/pg_dump/pg_restore.c
index 589b4ae..b16db03 100644
--- a/src/bin/pg_dump/pg_restore.c
+++ b/src/bin/pg_dump/pg_restore.c
@@ -59,6 +59,7 @@ main(int argc, char **argv)
int c;
int exit_code;
int numWorkers = 1;
+ int blobBatchSize = 0;
Archive *AH;
char *inputFileSpec;
static int disable_triggers = 0;
@@ -120,6 +121,7 @@ main(int argc, char **argv)
{"no-publications", no_argument, &no_publications, 1},
{"no-security-labels", no_argument, &no_security_labels, 1},
{"no-subscriptions", no_argument, &no_subscriptions, 1},
+ {"restore-blob-batch-size", required_argument, NULL, 4},
{NULL, 0, NULL, 0}
};
@@ -280,6 +282,10 @@ main(int argc, char **argv)
set_dump_section(optarg, &(opts->dumpSections));
break;
+ case 4: /* # of blobs to restore per transaction */
+ blobBatchSize = atoi(optarg);
+ break;
+
default:
fprintf(stderr, _("Try \"%s --help\" for more information.\n"), progname);
exit_nicely(1);
@@ -434,6 +440,7 @@ main(int argc, char **argv)
SortTocFromFile(AH);
AH->numWorkers = numWorkers;
+ AH->blobBatchSize = blobBatchSize;
if (opts->tocSummary)
PrintTOCSummary(AH);
@@ -506,6 +513,8 @@ usage(const char *progname)
printf(_(" --use-set-session-authorization\n"
" use SET SESSION AUTHORIZATION commands instead of\n"
" ALTER OWNER commands to set ownership\n"));
+ printf(_(" --restore-blob-batch-size=NUM\n"
+ " attempt to restore NUM large objects per transaction\n"));
printf(_("\nConnection options:\n"));
printf(_(" -h, --host=HOSTNAME database server host or socket directory\n"));
diff --git a/src/bin/pg_upgrade/option.c b/src/bin/pg_upgrade/option.c
index 9c9b313..868e9f6 100644
--- a/src/bin/pg_upgrade/option.c
+++ b/src/bin/pg_upgrade/option.c
@@ -57,6 +57,8 @@ parseCommandLine(int argc, char *argv[])
{"verbose", no_argument, NULL, 'v'},
{"clone", no_argument, NULL, 1},
{"index-collation-versions-unknown", no_argument, NULL, 2},
+ {"restore-jobs", required_argument, NULL, 3},
+ {"restore-blob-batch-size", required_argument, NULL, 4},
{NULL, 0, NULL, 0}
};
@@ -208,6 +210,14 @@ parseCommandLine(int argc, char *argv[])
user_opts.ind_coll_unknown = true;
break;
+ case 3:
+ user_opts.restore_jobs = atoi(optarg);
+ break;
+
+ case 4:
+ user_opts.blob_batch_size = atoi(optarg);
+ break;
+
default:
fprintf(stderr, _("Try \"%s --help\" for more information.\n"),
os_info.progname);
@@ -314,6 +324,8 @@ usage(void)
printf(_(" --clone clone instead of copying files to new cluster\n"));
printf(_(" --index-collation-versions-unknown\n"));
printf(_(" mark text indexes as needing to be rebuilt\n"));
+ printf(_(" --restore-blob-batch-size=NUM attempt to restore NUM large objects per\n"));
+ printf(_(" transaction\n"));
printf(_(" -?, --help show this help, then exit\n"));
printf(_("\n"
"Before running pg_upgrade you must:\n"
diff --git a/src/bin/pg_upgrade/pg_upgrade.c b/src/bin/pg_upgrade/pg_upgrade.c
index e23b8ca..095e980 100644
--- a/src/bin/pg_upgrade/pg_upgrade.c
+++ b/src/bin/pg_upgrade/pg_upgrade.c
@@ -385,10 +385,13 @@ create_new_objects(void)
parallel_exec_prog(log_file_name,
NULL,
"\"%s/pg_restore\" %s %s --exit-on-error --verbose "
+ "--jobs %d --restore-blob-batch-size %d "
"--dbname template1 \"%s\"",
new_cluster.bindir,
cluster_conn_opts(&new_cluster),
create_opts,
+ user_opts.restore_jobs,
+ user_opts.blob_batch_size,
sql_file_name);
}
diff --git a/src/bin/pg_upgrade/pg_upgrade.h b/src/bin/pg_upgrade/pg_upgrade.h
index 919a784..5647f96 100644
--- a/src/bin/pg_upgrade/pg_upgrade.h
+++ b/src/bin/pg_upgrade/pg_upgrade.h
@@ -291,6 +291,8 @@ typedef struct
* changes */
transferMode transfer_mode; /* copy files or link them? */
int jobs; /* number of processes/threads to use */
+ int restore_jobs; /* number of pg_restore --jobs to use */
+ int blob_batch_size; /* number of blobs to restore per xact */
char *socketdir; /* directory to use for Unix sockets */
bool ind_coll_unknown; /* mark unknown index collation versions */
} UserOpts;
^ permalink raw reply [nested|flat] 43+ messages in thread
* Re: pg_upgrade failing for 200+ million Large Objects
2021-03-08 11:02 Re: pg_upgrade failing for 200+ million Large Objects Tharakan, Robins <[email protected]>
2021-03-08 12:33 ` Re: pg_upgrade failing for 200+ million Large Objects Magnus Hagander <[email protected]>
2021-03-08 14:13 ` Re: pg_upgrade failing for 200+ million Large Objects Robins Tharakan <[email protected]>
2021-03-08 16:33 ` Re: pg_upgrade failing for 200+ million Large Objects Tom Lane <[email protected]>
2021-03-08 16:35 ` Re: pg_upgrade failing for 200+ million Large Objects Magnus Hagander <[email protected]>
2021-03-08 16:58 ` Re: pg_upgrade failing for 200+ million Large Objects Tom Lane <[email protected]>
2021-03-20 04:39 ` Re: pg_upgrade failing for 200+ million Large Objects Jan Wieck <[email protected]>
2021-03-20 15:23 ` Re: pg_upgrade failing for 200+ million Large Objects Tom Lane <[email protected]>
2021-03-20 16:55 ` Re: pg_upgrade failing for 200+ million Large Objects Jan Wieck <[email protected]>
2021-03-21 11:47 ` Re: pg_upgrade failing for 200+ million Large Objects Andrew Dunstan <[email protected]>
2021-03-21 16:56 ` Re: pg_upgrade failing for 200+ million Large Objects Jan Wieck <[email protected]>
2021-03-21 18:18 ` Re: pg_upgrade failing for 200+ million Large Objects Andrew Dunstan <[email protected]>
2021-03-22 21:36 ` Re: pg_upgrade failing for 200+ million Large Objects Zhihong Yu <[email protected]>
2021-03-22 23:18 ` Re: pg_upgrade failing for 200+ million Large Objects Jan Wieck <[email protected]>
2021-03-23 12:51 ` Re: pg_upgrade failing for 200+ million Large Objects Jan Wieck <[email protected]>
@ 2021-03-23 14:56 ` Bruce Momjian <[email protected]>
2021-03-23 17:25 ` Re: pg_upgrade failing for 200+ million Large Objects Jan Wieck <[email protected]>
0 siblings, 1 reply; 43+ messages in thread
From: Bruce Momjian @ 2021-03-23 14:56 UTC (permalink / raw)
To: Jan Wieck <[email protected]>; +Cc: Zhihong Yu <[email protected]>; Andrew Dunstan <[email protected]>; Tom Lane <[email protected]>; Magnus Hagander <[email protected]>; Robins Tharakan <[email protected]>; Peter Eisentraut <[email protected]>; pgsql-hackers
On Tue, Mar 23, 2021 at 08:51:32AM -0400, Jan Wieck wrote:
> On 3/22/21 7:18 PM, Jan Wieck wrote:
> > On 3/22/21 5:36 PM, Zhihong Yu wrote:
> > > Hi,
> > >
> > > w.r.t. pg_upgrade_improvements.v2.diff.
> > >
> > > + blobBatchCount = 0;
> > > + blobInXact = false;
> > >
> > > The count and bool flag are always reset in tandem. It seems
> > > variable blobInXact is not needed.
> >
> > You are right. I will fix that.
>
> New patch v3 attached.
Would it be better to allow pg_upgrade to pass arbitrary arguments to
pg_restore, instead of just these specific ones?
--
Bruce Momjian <[email protected]> https://momjian.us
EDB https://enterprisedb.com
If only the physical world exists, free will is an illusion.
^ permalink raw reply [nested|flat] 43+ messages in thread
* Re: pg_upgrade failing for 200+ million Large Objects
2021-03-08 11:02 Re: pg_upgrade failing for 200+ million Large Objects Tharakan, Robins <[email protected]>
2021-03-08 12:33 ` Re: pg_upgrade failing for 200+ million Large Objects Magnus Hagander <[email protected]>
2021-03-08 14:13 ` Re: pg_upgrade failing for 200+ million Large Objects Robins Tharakan <[email protected]>
2021-03-08 16:33 ` Re: pg_upgrade failing for 200+ million Large Objects Tom Lane <[email protected]>
2021-03-08 16:35 ` Re: pg_upgrade failing for 200+ million Large Objects Magnus Hagander <[email protected]>
2021-03-08 16:58 ` Re: pg_upgrade failing for 200+ million Large Objects Tom Lane <[email protected]>
2021-03-20 04:39 ` Re: pg_upgrade failing for 200+ million Large Objects Jan Wieck <[email protected]>
2021-03-20 15:23 ` Re: pg_upgrade failing for 200+ million Large Objects Tom Lane <[email protected]>
2021-03-20 16:55 ` Re: pg_upgrade failing for 200+ million Large Objects Jan Wieck <[email protected]>
2021-03-21 11:47 ` Re: pg_upgrade failing for 200+ million Large Objects Andrew Dunstan <[email protected]>
2021-03-21 16:56 ` Re: pg_upgrade failing for 200+ million Large Objects Jan Wieck <[email protected]>
2021-03-21 18:18 ` Re: pg_upgrade failing for 200+ million Large Objects Andrew Dunstan <[email protected]>
2021-03-22 21:36 ` Re: pg_upgrade failing for 200+ million Large Objects Zhihong Yu <[email protected]>
2021-03-22 23:18 ` Re: pg_upgrade failing for 200+ million Large Objects Jan Wieck <[email protected]>
2021-03-23 12:51 ` Re: pg_upgrade failing for 200+ million Large Objects Jan Wieck <[email protected]>
2021-03-23 14:56 ` Re: pg_upgrade failing for 200+ million Large Objects Bruce Momjian <[email protected]>
@ 2021-03-23 17:25 ` Jan Wieck <[email protected]>
2021-03-23 18:06 ` Re: pg_upgrade failing for 200+ million Large Objects Bruce Momjian <[email protected]>
0 siblings, 1 reply; 43+ messages in thread
From: Jan Wieck @ 2021-03-23 17:25 UTC (permalink / raw)
To: Bruce Momjian <[email protected]>; +Cc: Zhihong Yu <[email protected]>; Andrew Dunstan <[email protected]>; Tom Lane <[email protected]>; Magnus Hagander <[email protected]>; Robins Tharakan <[email protected]>; Peter Eisentraut <[email protected]>; pgsql-hackers
On 3/23/21 10:56 AM, Bruce Momjian wrote:
> On Tue, Mar 23, 2021 at 08:51:32AM -0400, Jan Wieck wrote:
>> On 3/22/21 7:18 PM, Jan Wieck wrote:
>> > On 3/22/21 5:36 PM, Zhihong Yu wrote:
>> > > Hi,
>> > >
>> > > w.r.t. pg_upgrade_improvements.v2.diff.
>> > >
>> > > + blobBatchCount = 0;
>> > > + blobInXact = false;
>> > >
>> > > The count and bool flag are always reset in tandem. It seems
>> > > variable blobInXact is not needed.
>> >
>> > You are right. I will fix that.
>>
>> New patch v3 attached.
>
> Would it be better to allow pg_upgrade to pass arbitrary arguments to
> pg_restore, instead of just these specific ones?
>
That would mean arbitrary parameters to pg_dump as well as pg_restore.
But yes, that would probably be better in the long run.
Any suggestion as to how that would actually look like? Unfortunately
pg_restore has -[dDoOr] already used, so it doesn't look like there will
be any naturally intelligible short options for that.
Regards, Jan
--
Jan Wieck
Principle Database Engineer
Amazon Web Services
^ permalink raw reply [nested|flat] 43+ messages in thread
* Re: pg_upgrade failing for 200+ million Large Objects
2021-03-08 11:02 Re: pg_upgrade failing for 200+ million Large Objects Tharakan, Robins <[email protected]>
2021-03-08 12:33 ` Re: pg_upgrade failing for 200+ million Large Objects Magnus Hagander <[email protected]>
2021-03-08 14:13 ` Re: pg_upgrade failing for 200+ million Large Objects Robins Tharakan <[email protected]>
2021-03-08 16:33 ` Re: pg_upgrade failing for 200+ million Large Objects Tom Lane <[email protected]>
2021-03-08 16:35 ` Re: pg_upgrade failing for 200+ million Large Objects Magnus Hagander <[email protected]>
2021-03-08 16:58 ` Re: pg_upgrade failing for 200+ million Large Objects Tom Lane <[email protected]>
2021-03-20 04:39 ` Re: pg_upgrade failing for 200+ million Large Objects Jan Wieck <[email protected]>
2021-03-20 15:23 ` Re: pg_upgrade failing for 200+ million Large Objects Tom Lane <[email protected]>
2021-03-20 16:55 ` Re: pg_upgrade failing for 200+ million Large Objects Jan Wieck <[email protected]>
2021-03-21 11:47 ` Re: pg_upgrade failing for 200+ million Large Objects Andrew Dunstan <[email protected]>
2021-03-21 16:56 ` Re: pg_upgrade failing for 200+ million Large Objects Jan Wieck <[email protected]>
2021-03-21 18:18 ` Re: pg_upgrade failing for 200+ million Large Objects Andrew Dunstan <[email protected]>
2021-03-22 21:36 ` Re: pg_upgrade failing for 200+ million Large Objects Zhihong Yu <[email protected]>
2021-03-22 23:18 ` Re: pg_upgrade failing for 200+ million Large Objects Jan Wieck <[email protected]>
2021-03-23 12:51 ` Re: pg_upgrade failing for 200+ million Large Objects Jan Wieck <[email protected]>
2021-03-23 14:56 ` Re: pg_upgrade failing for 200+ million Large Objects Bruce Momjian <[email protected]>
2021-03-23 17:25 ` Re: pg_upgrade failing for 200+ million Large Objects Jan Wieck <[email protected]>
@ 2021-03-23 18:06 ` Bruce Momjian <[email protected]>
2021-03-23 18:23 ` Re: pg_upgrade failing for 200+ million Large Objects Jan Wieck <[email protected]>
0 siblings, 1 reply; 43+ messages in thread
From: Bruce Momjian @ 2021-03-23 18:06 UTC (permalink / raw)
To: Jan Wieck <[email protected]>; +Cc: Zhihong Yu <[email protected]>; Andrew Dunstan <[email protected]>; Tom Lane <[email protected]>; Magnus Hagander <[email protected]>; Robins Tharakan <[email protected]>; Peter Eisentraut <[email protected]>; pgsql-hackers
On Tue, Mar 23, 2021 at 01:25:15PM -0400, Jan Wieck wrote:
> On 3/23/21 10:56 AM, Bruce Momjian wrote:
> > Would it be better to allow pg_upgrade to pass arbitrary arguments to
> > pg_restore, instead of just these specific ones?
> >
>
> That would mean arbitrary parameters to pg_dump as well as pg_restore. But
> yes, that would probably be better in the long run.
>
> Any suggestion as to how that would actually look like? Unfortunately
> pg_restore has -[dDoOr] already used, so it doesn't look like there will be
> any naturally intelligible short options for that.
We have the postmaster which can pass arbitrary arguments to postgres
processes using -o.
--
Bruce Momjian <[email protected]> https://momjian.us
EDB https://enterprisedb.com
If only the physical world exists, free will is an illusion.
^ permalink raw reply [nested|flat] 43+ messages in thread
* Re: pg_upgrade failing for 200+ million Large Objects
2021-03-08 11:02 Re: pg_upgrade failing for 200+ million Large Objects Tharakan, Robins <[email protected]>
2021-03-08 12:33 ` Re: pg_upgrade failing for 200+ million Large Objects Magnus Hagander <[email protected]>
2021-03-08 14:13 ` Re: pg_upgrade failing for 200+ million Large Objects Robins Tharakan <[email protected]>
2021-03-08 16:33 ` Re: pg_upgrade failing for 200+ million Large Objects Tom Lane <[email protected]>
2021-03-08 16:35 ` Re: pg_upgrade failing for 200+ million Large Objects Magnus Hagander <[email protected]>
2021-03-08 16:58 ` Re: pg_upgrade failing for 200+ million Large Objects Tom Lane <[email protected]>
2021-03-20 04:39 ` Re: pg_upgrade failing for 200+ million Large Objects Jan Wieck <[email protected]>
2021-03-20 15:23 ` Re: pg_upgrade failing for 200+ million Large Objects Tom Lane <[email protected]>
2021-03-20 16:55 ` Re: pg_upgrade failing for 200+ million Large Objects Jan Wieck <[email protected]>
2021-03-21 11:47 ` Re: pg_upgrade failing for 200+ million Large Objects Andrew Dunstan <[email protected]>
2021-03-21 16:56 ` Re: pg_upgrade failing for 200+ million Large Objects Jan Wieck <[email protected]>
2021-03-21 18:18 ` Re: pg_upgrade failing for 200+ million Large Objects Andrew Dunstan <[email protected]>
2021-03-22 21:36 ` Re: pg_upgrade failing for 200+ million Large Objects Zhihong Yu <[email protected]>
2021-03-22 23:18 ` Re: pg_upgrade failing for 200+ million Large Objects Jan Wieck <[email protected]>
2021-03-23 12:51 ` Re: pg_upgrade failing for 200+ million Large Objects Jan Wieck <[email protected]>
2021-03-23 14:56 ` Re: pg_upgrade failing for 200+ million Large Objects Bruce Momjian <[email protected]>
2021-03-23 17:25 ` Re: pg_upgrade failing for 200+ million Large Objects Jan Wieck <[email protected]>
2021-03-23 18:06 ` Re: pg_upgrade failing for 200+ million Large Objects Bruce Momjian <[email protected]>
@ 2021-03-23 18:23 ` Jan Wieck <[email protected]>
2021-03-23 18:25 ` Re: pg_upgrade failing for 200+ million Large Objects Bruce Momjian <[email protected]>
2021-03-23 18:35 ` Re: pg_upgrade failing for 200+ million Large Objects Tom Lane <[email protected]>
0 siblings, 2 replies; 43+ messages in thread
From: Jan Wieck @ 2021-03-23 18:23 UTC (permalink / raw)
To: Bruce Momjian <[email protected]>; +Cc: Zhihong Yu <[email protected]>; Andrew Dunstan <[email protected]>; Tom Lane <[email protected]>; Magnus Hagander <[email protected]>; Robins Tharakan <[email protected]>; Peter Eisentraut <[email protected]>; pgsql-hackers
On 3/23/21 2:06 PM, Bruce Momjian wrote:
> We have the postmaster which can pass arbitrary arguments to postgres
> processes using -o.
Right, and -o is already taken in pg_upgrade for sending options to the
old postmaster.
What we are looking for are options for sending options to pg_dump and
pg_restore, which are not postmasters or children of postmaster, but
rather clients. There is no option to send options to clients of
postmasters.
So the question remains, how do we name this?
--pg-dump-options "<string>"
--pg-restore-options "<string>"
where "<string>" could be something like "--whatever[=NUM] [...]" would
be something unambiguous.
Regards, Jan
--
Jan Wieck
Principle Database Engineer
Amazon Web Services
^ permalink raw reply [nested|flat] 43+ messages in thread
* Re: pg_upgrade failing for 200+ million Large Objects
2021-03-08 11:02 Re: pg_upgrade failing for 200+ million Large Objects Tharakan, Robins <[email protected]>
2021-03-08 12:33 ` Re: pg_upgrade failing for 200+ million Large Objects Magnus Hagander <[email protected]>
2021-03-08 14:13 ` Re: pg_upgrade failing for 200+ million Large Objects Robins Tharakan <[email protected]>
2021-03-08 16:33 ` Re: pg_upgrade failing for 200+ million Large Objects Tom Lane <[email protected]>
2021-03-08 16:35 ` Re: pg_upgrade failing for 200+ million Large Objects Magnus Hagander <[email protected]>
2021-03-08 16:58 ` Re: pg_upgrade failing for 200+ million Large Objects Tom Lane <[email protected]>
2021-03-20 04:39 ` Re: pg_upgrade failing for 200+ million Large Objects Jan Wieck <[email protected]>
2021-03-20 15:23 ` Re: pg_upgrade failing for 200+ million Large Objects Tom Lane <[email protected]>
2021-03-20 16:55 ` Re: pg_upgrade failing for 200+ million Large Objects Jan Wieck <[email protected]>
2021-03-21 11:47 ` Re: pg_upgrade failing for 200+ million Large Objects Andrew Dunstan <[email protected]>
2021-03-21 16:56 ` Re: pg_upgrade failing for 200+ million Large Objects Jan Wieck <[email protected]>
2021-03-21 18:18 ` Re: pg_upgrade failing for 200+ million Large Objects Andrew Dunstan <[email protected]>
2021-03-22 21:36 ` Re: pg_upgrade failing for 200+ million Large Objects Zhihong Yu <[email protected]>
2021-03-22 23:18 ` Re: pg_upgrade failing for 200+ million Large Objects Jan Wieck <[email protected]>
2021-03-23 12:51 ` Re: pg_upgrade failing for 200+ million Large Objects Jan Wieck <[email protected]>
2021-03-23 14:56 ` Re: pg_upgrade failing for 200+ million Large Objects Bruce Momjian <[email protected]>
2021-03-23 17:25 ` Re: pg_upgrade failing for 200+ million Large Objects Jan Wieck <[email protected]>
2021-03-23 18:06 ` Re: pg_upgrade failing for 200+ million Large Objects Bruce Momjian <[email protected]>
2021-03-23 18:23 ` Re: pg_upgrade failing for 200+ million Large Objects Jan Wieck <[email protected]>
@ 2021-03-23 18:25 ` Bruce Momjian <[email protected]>
1 sibling, 0 replies; 43+ messages in thread
From: Bruce Momjian @ 2021-03-23 18:25 UTC (permalink / raw)
To: Jan Wieck <[email protected]>; +Cc: Zhihong Yu <[email protected]>; Andrew Dunstan <[email protected]>; Tom Lane <[email protected]>; Magnus Hagander <[email protected]>; Robins Tharakan <[email protected]>; Peter Eisentraut <[email protected]>; pgsql-hackers
On Tue, Mar 23, 2021 at 02:23:03PM -0400, Jan Wieck wrote:
> On 3/23/21 2:06 PM, Bruce Momjian wrote:
> > We have the postmaster which can pass arbitrary arguments to postgres
> > processes using -o.
>
> Right, and -o is already taken in pg_upgrade for sending options to the old
> postmaster.
>
> What we are looking for are options for sending options to pg_dump and
> pg_restore, which are not postmasters or children of postmaster, but rather
> clients. There is no option to send options to clients of postmasters.
>
> So the question remains, how do we name this?
>
> --pg-dump-options "<string>"
> --pg-restore-options "<string>"
>
> where "<string>" could be something like "--whatever[=NUM] [...]" would be
> something unambiguous.
Sure. I don't think the letter you use is a problem.
--
Bruce Momjian <[email protected]> https://momjian.us
EDB https://enterprisedb.com
If only the physical world exists, free will is an illusion.
^ permalink raw reply [nested|flat] 43+ messages in thread
* Re: pg_upgrade failing for 200+ million Large Objects
2021-03-08 11:02 Re: pg_upgrade failing for 200+ million Large Objects Tharakan, Robins <[email protected]>
2021-03-08 12:33 ` Re: pg_upgrade failing for 200+ million Large Objects Magnus Hagander <[email protected]>
2021-03-08 14:13 ` Re: pg_upgrade failing for 200+ million Large Objects Robins Tharakan <[email protected]>
2021-03-08 16:33 ` Re: pg_upgrade failing for 200+ million Large Objects Tom Lane <[email protected]>
2021-03-08 16:35 ` Re: pg_upgrade failing for 200+ million Large Objects Magnus Hagander <[email protected]>
2021-03-08 16:58 ` Re: pg_upgrade failing for 200+ million Large Objects Tom Lane <[email protected]>
2021-03-20 04:39 ` Re: pg_upgrade failing for 200+ million Large Objects Jan Wieck <[email protected]>
2021-03-20 15:23 ` Re: pg_upgrade failing for 200+ million Large Objects Tom Lane <[email protected]>
2021-03-20 16:55 ` Re: pg_upgrade failing for 200+ million Large Objects Jan Wieck <[email protected]>
2021-03-21 11:47 ` Re: pg_upgrade failing for 200+ million Large Objects Andrew Dunstan <[email protected]>
2021-03-21 16:56 ` Re: pg_upgrade failing for 200+ million Large Objects Jan Wieck <[email protected]>
2021-03-21 18:18 ` Re: pg_upgrade failing for 200+ million Large Objects Andrew Dunstan <[email protected]>
2021-03-22 21:36 ` Re: pg_upgrade failing for 200+ million Large Objects Zhihong Yu <[email protected]>
2021-03-22 23:18 ` Re: pg_upgrade failing for 200+ million Large Objects Jan Wieck <[email protected]>
2021-03-23 12:51 ` Re: pg_upgrade failing for 200+ million Large Objects Jan Wieck <[email protected]>
2021-03-23 14:56 ` Re: pg_upgrade failing for 200+ million Large Objects Bruce Momjian <[email protected]>
2021-03-23 17:25 ` Re: pg_upgrade failing for 200+ million Large Objects Jan Wieck <[email protected]>
2021-03-23 18:06 ` Re: pg_upgrade failing for 200+ million Large Objects Bruce Momjian <[email protected]>
2021-03-23 18:23 ` Re: pg_upgrade failing for 200+ million Large Objects Jan Wieck <[email protected]>
@ 2021-03-23 18:35 ` Tom Lane <[email protected]>
2021-03-23 18:54 ` Re: pg_upgrade failing for 200+ million Large Objects Jan Wieck <[email protected]>
1 sibling, 1 reply; 43+ messages in thread
From: Tom Lane @ 2021-03-23 18:35 UTC (permalink / raw)
To: Jan Wieck <[email protected]>; +Cc: Bruce Momjian <[email protected]>; Zhihong Yu <[email protected]>; Andrew Dunstan <[email protected]>; Magnus Hagander <[email protected]>; Robins Tharakan <[email protected]>; Peter Eisentraut <[email protected]>; pgsql-hackers
Jan Wieck <[email protected]> writes:
> So the question remains, how do we name this?
> --pg-dump-options "<string>"
> --pg-restore-options "<string>"
If you're passing multiple options, that is
--pg-dump-options "--foo=x --bar=y"
it seems just horribly fragile. Lose the double quotes and suddenly
--bar is a separate option to pg_upgrade itself, not part of the argument
for the previous option. That's pretty easy to do when passing things
through shell scripts, too. So it'd likely be safer to write
--pg-dump-option=--foo=x --pg-dump-option=--bar=y
which requires pg_upgrade to allow aggregating multiple options,
but you'd probably want it to act that way anyway.
regards, tom lane
^ permalink raw reply [nested|flat] 43+ messages in thread
* Re: pg_upgrade failing for 200+ million Large Objects
2021-03-08 11:02 Re: pg_upgrade failing for 200+ million Large Objects Tharakan, Robins <[email protected]>
2021-03-08 12:33 ` Re: pg_upgrade failing for 200+ million Large Objects Magnus Hagander <[email protected]>
2021-03-08 14:13 ` Re: pg_upgrade failing for 200+ million Large Objects Robins Tharakan <[email protected]>
2021-03-08 16:33 ` Re: pg_upgrade failing for 200+ million Large Objects Tom Lane <[email protected]>
2021-03-08 16:35 ` Re: pg_upgrade failing for 200+ million Large Objects Magnus Hagander <[email protected]>
2021-03-08 16:58 ` Re: pg_upgrade failing for 200+ million Large Objects Tom Lane <[email protected]>
2021-03-20 04:39 ` Re: pg_upgrade failing for 200+ million Large Objects Jan Wieck <[email protected]>
2021-03-20 15:23 ` Re: pg_upgrade failing for 200+ million Large Objects Tom Lane <[email protected]>
2021-03-20 16:55 ` Re: pg_upgrade failing for 200+ million Large Objects Jan Wieck <[email protected]>
2021-03-21 11:47 ` Re: pg_upgrade failing for 200+ million Large Objects Andrew Dunstan <[email protected]>
2021-03-21 16:56 ` Re: pg_upgrade failing for 200+ million Large Objects Jan Wieck <[email protected]>
2021-03-21 18:18 ` Re: pg_upgrade failing for 200+ million Large Objects Andrew Dunstan <[email protected]>
2021-03-22 21:36 ` Re: pg_upgrade failing for 200+ million Large Objects Zhihong Yu <[email protected]>
2021-03-22 23:18 ` Re: pg_upgrade failing for 200+ million Large Objects Jan Wieck <[email protected]>
2021-03-23 12:51 ` Re: pg_upgrade failing for 200+ million Large Objects Jan Wieck <[email protected]>
2021-03-23 14:56 ` Re: pg_upgrade failing for 200+ million Large Objects Bruce Momjian <[email protected]>
2021-03-23 17:25 ` Re: pg_upgrade failing for 200+ million Large Objects Jan Wieck <[email protected]>
2021-03-23 18:06 ` Re: pg_upgrade failing for 200+ million Large Objects Bruce Momjian <[email protected]>
2021-03-23 18:23 ` Re: pg_upgrade failing for 200+ million Large Objects Jan Wieck <[email protected]>
2021-03-23 18:35 ` Re: pg_upgrade failing for 200+ million Large Objects Tom Lane <[email protected]>
@ 2021-03-23 18:54 ` Jan Wieck <[email protected]>
2021-03-23 18:59 ` Re: pg_upgrade failing for 200+ million Large Objects Tom Lane <[email protected]>
0 siblings, 1 reply; 43+ messages in thread
From: Jan Wieck @ 2021-03-23 18:54 UTC (permalink / raw)
To: Tom Lane <[email protected]>; +Cc: Bruce Momjian <[email protected]>; Zhihong Yu <[email protected]>; Andrew Dunstan <[email protected]>; Magnus Hagander <[email protected]>; Robins Tharakan <[email protected]>; Peter Eisentraut <[email protected]>; pgsql-hackers
On 3/23/21 2:35 PM, Tom Lane wrote:
> Jan Wieck <[email protected]> writes:
>> So the question remains, how do we name this?
>
>> --pg-dump-options "<string>"
>> --pg-restore-options "<string>"
>
> If you're passing multiple options, that is
>
> --pg-dump-options "--foo=x --bar=y"
>
> it seems just horribly fragile. Lose the double quotes and suddenly
> --bar is a separate option to pg_upgrade itself, not part of the argument
> for the previous option. That's pretty easy to do when passing things
> through shell scripts, too. So it'd likely be safer to write
>
> --pg-dump-option=--foo=x --pg-dump-option=--bar=y
>
> which requires pg_upgrade to allow aggregating multiple options,
> but you'd probably want it to act that way anyway.
... which would be all really easy if pg_upgrade wouldn't be assembling
a shell script string to pass into parallel_exec_prog() by itself.
But I will see what I can do ...
Regards, Jan
--
Jan Wieck
Principle Database Engineer
Amazon Web Services
^ permalink raw reply [nested|flat] 43+ messages in thread
* Re: pg_upgrade failing for 200+ million Large Objects
2021-03-08 11:02 Re: pg_upgrade failing for 200+ million Large Objects Tharakan, Robins <[email protected]>
2021-03-08 12:33 ` Re: pg_upgrade failing for 200+ million Large Objects Magnus Hagander <[email protected]>
2021-03-08 14:13 ` Re: pg_upgrade failing for 200+ million Large Objects Robins Tharakan <[email protected]>
2021-03-08 16:33 ` Re: pg_upgrade failing for 200+ million Large Objects Tom Lane <[email protected]>
2021-03-08 16:35 ` Re: pg_upgrade failing for 200+ million Large Objects Magnus Hagander <[email protected]>
2021-03-08 16:58 ` Re: pg_upgrade failing for 200+ million Large Objects Tom Lane <[email protected]>
2021-03-20 04:39 ` Re: pg_upgrade failing for 200+ million Large Objects Jan Wieck <[email protected]>
2021-03-20 15:23 ` Re: pg_upgrade failing for 200+ million Large Objects Tom Lane <[email protected]>
2021-03-20 16:55 ` Re: pg_upgrade failing for 200+ million Large Objects Jan Wieck <[email protected]>
2021-03-21 11:47 ` Re: pg_upgrade failing for 200+ million Large Objects Andrew Dunstan <[email protected]>
2021-03-21 16:56 ` Re: pg_upgrade failing for 200+ million Large Objects Jan Wieck <[email protected]>
2021-03-21 18:18 ` Re: pg_upgrade failing for 200+ million Large Objects Andrew Dunstan <[email protected]>
2021-03-22 21:36 ` Re: pg_upgrade failing for 200+ million Large Objects Zhihong Yu <[email protected]>
2021-03-22 23:18 ` Re: pg_upgrade failing for 200+ million Large Objects Jan Wieck <[email protected]>
2021-03-23 12:51 ` Re: pg_upgrade failing for 200+ million Large Objects Jan Wieck <[email protected]>
2021-03-23 14:56 ` Re: pg_upgrade failing for 200+ million Large Objects Bruce Momjian <[email protected]>
2021-03-23 17:25 ` Re: pg_upgrade failing for 200+ million Large Objects Jan Wieck <[email protected]>
2021-03-23 18:06 ` Re: pg_upgrade failing for 200+ million Large Objects Bruce Momjian <[email protected]>
2021-03-23 18:23 ` Re: pg_upgrade failing for 200+ million Large Objects Jan Wieck <[email protected]>
2021-03-23 18:35 ` Re: pg_upgrade failing for 200+ million Large Objects Tom Lane <[email protected]>
2021-03-23 18:54 ` Re: pg_upgrade failing for 200+ million Large Objects Jan Wieck <[email protected]>
@ 2021-03-23 18:59 ` Tom Lane <[email protected]>
2021-03-23 19:22 ` Re: pg_upgrade failing for 200+ million Large Objects Jan Wieck <[email protected]>
0 siblings, 1 reply; 43+ messages in thread
From: Tom Lane @ 2021-03-23 18:59 UTC (permalink / raw)
To: Jan Wieck <[email protected]>; +Cc: Bruce Momjian <[email protected]>; Zhihong Yu <[email protected]>; Andrew Dunstan <[email protected]>; Magnus Hagander <[email protected]>; Robins Tharakan <[email protected]>; Peter Eisentraut <[email protected]>; pgsql-hackers
Jan Wieck <[email protected]> writes:
> On 3/23/21 2:35 PM, Tom Lane wrote:
>> If you're passing multiple options, that is
>> --pg-dump-options "--foo=x --bar=y"
>> it seems just horribly fragile. Lose the double quotes and suddenly
>> --bar is a separate option to pg_upgrade itself, not part of the argument
>> for the previous option. That's pretty easy to do when passing things
>> through shell scripts, too.
> ... which would be all really easy if pg_upgrade wouldn't be assembling
> a shell script string to pass into parallel_exec_prog() by itself.
No, what I was worried about is shell script(s) that invoke pg_upgrade
and have to pass down some of these options through multiple levels of
option parsing.
BTW, it doesn't seem like the "pg-" prefix has any value-add here,
so maybe "--dump-option" and "--restore-option" would be suitable
spellings.
regards, tom lane
^ permalink raw reply [nested|flat] 43+ messages in thread
* Re: pg_upgrade failing for 200+ million Large Objects
2021-03-08 11:02 Re: pg_upgrade failing for 200+ million Large Objects Tharakan, Robins <[email protected]>
2021-03-08 12:33 ` Re: pg_upgrade failing for 200+ million Large Objects Magnus Hagander <[email protected]>
2021-03-08 14:13 ` Re: pg_upgrade failing for 200+ million Large Objects Robins Tharakan <[email protected]>
2021-03-08 16:33 ` Re: pg_upgrade failing for 200+ million Large Objects Tom Lane <[email protected]>
2021-03-08 16:35 ` Re: pg_upgrade failing for 200+ million Large Objects Magnus Hagander <[email protected]>
2021-03-08 16:58 ` Re: pg_upgrade failing for 200+ million Large Objects Tom Lane <[email protected]>
2021-03-20 04:39 ` Re: pg_upgrade failing for 200+ million Large Objects Jan Wieck <[email protected]>
2021-03-20 15:23 ` Re: pg_upgrade failing for 200+ million Large Objects Tom Lane <[email protected]>
2021-03-20 16:55 ` Re: pg_upgrade failing for 200+ million Large Objects Jan Wieck <[email protected]>
2021-03-21 11:47 ` Re: pg_upgrade failing for 200+ million Large Objects Andrew Dunstan <[email protected]>
2021-03-21 16:56 ` Re: pg_upgrade failing for 200+ million Large Objects Jan Wieck <[email protected]>
2021-03-21 18:18 ` Re: pg_upgrade failing for 200+ million Large Objects Andrew Dunstan <[email protected]>
2021-03-22 21:36 ` Re: pg_upgrade failing for 200+ million Large Objects Zhihong Yu <[email protected]>
2021-03-22 23:18 ` Re: pg_upgrade failing for 200+ million Large Objects Jan Wieck <[email protected]>
2021-03-23 12:51 ` Re: pg_upgrade failing for 200+ million Large Objects Jan Wieck <[email protected]>
2021-03-23 14:56 ` Re: pg_upgrade failing for 200+ million Large Objects Bruce Momjian <[email protected]>
2021-03-23 17:25 ` Re: pg_upgrade failing for 200+ million Large Objects Jan Wieck <[email protected]>
2021-03-23 18:06 ` Re: pg_upgrade failing for 200+ million Large Objects Bruce Momjian <[email protected]>
2021-03-23 18:23 ` Re: pg_upgrade failing for 200+ million Large Objects Jan Wieck <[email protected]>
2021-03-23 18:35 ` Re: pg_upgrade failing for 200+ million Large Objects Tom Lane <[email protected]>
2021-03-23 18:54 ` Re: pg_upgrade failing for 200+ million Large Objects Jan Wieck <[email protected]>
2021-03-23 18:59 ` Re: pg_upgrade failing for 200+ million Large Objects Tom Lane <[email protected]>
@ 2021-03-23 19:22 ` Jan Wieck <[email protected]>
2021-03-23 19:35 ` Re: pg_upgrade failing for 200+ million Large Objects Tom Lane <[email protected]>
0 siblings, 1 reply; 43+ messages in thread
From: Jan Wieck @ 2021-03-23 19:22 UTC (permalink / raw)
To: Tom Lane <[email protected]>; +Cc: Bruce Momjian <[email protected]>; Zhihong Yu <[email protected]>; Andrew Dunstan <[email protected]>; Magnus Hagander <[email protected]>; Robins Tharakan <[email protected]>; Peter Eisentraut <[email protected]>; pgsql-hackers
On 3/23/21 2:59 PM, Tom Lane wrote:
> Jan Wieck <[email protected]> writes:
>> On 3/23/21 2:35 PM, Tom Lane wrote:
>>> If you're passing multiple options, that is
>>> --pg-dump-options "--foo=x --bar=y"
>>> it seems just horribly fragile. Lose the double quotes and suddenly
>>> --bar is a separate option to pg_upgrade itself, not part of the argument
>>> for the previous option. That's pretty easy to do when passing things
>>> through shell scripts, too.
>
>> ... which would be all really easy if pg_upgrade wouldn't be assembling
>> a shell script string to pass into parallel_exec_prog() by itself.
>
> No, what I was worried about is shell script(s) that invoke pg_upgrade
> and have to pass down some of these options through multiple levels of
> option parsing.
The problem here is that pg_upgrade itself is invoking a shell again. It
is not assembling an array of arguments to pass into exec*(). I'd be a
happy camper if it did the latter. But as things are we'd have to add
full shell escapeing for arbitrary strings.
>
> BTW, it doesn't seem like the "pg-" prefix has any value-add here,
> so maybe "--dump-option" and "--restore-option" would be suitable
> spellings.
Agreed.
Regards, Jan
--
Jan Wieck
Principle Database Engineer
Amazon Web Services
^ permalink raw reply [nested|flat] 43+ messages in thread
* Re: pg_upgrade failing for 200+ million Large Objects
2021-03-08 11:02 Re: pg_upgrade failing for 200+ million Large Objects Tharakan, Robins <[email protected]>
2021-03-08 12:33 ` Re: pg_upgrade failing for 200+ million Large Objects Magnus Hagander <[email protected]>
2021-03-08 14:13 ` Re: pg_upgrade failing for 200+ million Large Objects Robins Tharakan <[email protected]>
2021-03-08 16:33 ` Re: pg_upgrade failing for 200+ million Large Objects Tom Lane <[email protected]>
2021-03-08 16:35 ` Re: pg_upgrade failing for 200+ million Large Objects Magnus Hagander <[email protected]>
2021-03-08 16:58 ` Re: pg_upgrade failing for 200+ million Large Objects Tom Lane <[email protected]>
2021-03-20 04:39 ` Re: pg_upgrade failing for 200+ million Large Objects Jan Wieck <[email protected]>
2021-03-20 15:23 ` Re: pg_upgrade failing for 200+ million Large Objects Tom Lane <[email protected]>
2021-03-20 16:55 ` Re: pg_upgrade failing for 200+ million Large Objects Jan Wieck <[email protected]>
2021-03-21 11:47 ` Re: pg_upgrade failing for 200+ million Large Objects Andrew Dunstan <[email protected]>
2021-03-21 16:56 ` Re: pg_upgrade failing for 200+ million Large Objects Jan Wieck <[email protected]>
2021-03-21 18:18 ` Re: pg_upgrade failing for 200+ million Large Objects Andrew Dunstan <[email protected]>
2021-03-22 21:36 ` Re: pg_upgrade failing for 200+ million Large Objects Zhihong Yu <[email protected]>
2021-03-22 23:18 ` Re: pg_upgrade failing for 200+ million Large Objects Jan Wieck <[email protected]>
2021-03-23 12:51 ` Re: pg_upgrade failing for 200+ million Large Objects Jan Wieck <[email protected]>
2021-03-23 14:56 ` Re: pg_upgrade failing for 200+ million Large Objects Bruce Momjian <[email protected]>
2021-03-23 17:25 ` Re: pg_upgrade failing for 200+ million Large Objects Jan Wieck <[email protected]>
2021-03-23 18:06 ` Re: pg_upgrade failing for 200+ million Large Objects Bruce Momjian <[email protected]>
2021-03-23 18:23 ` Re: pg_upgrade failing for 200+ million Large Objects Jan Wieck <[email protected]>
2021-03-23 18:35 ` Re: pg_upgrade failing for 200+ million Large Objects Tom Lane <[email protected]>
2021-03-23 18:54 ` Re: pg_upgrade failing for 200+ million Large Objects Jan Wieck <[email protected]>
2021-03-23 18:59 ` Re: pg_upgrade failing for 200+ million Large Objects Tom Lane <[email protected]>
2021-03-23 19:22 ` Re: pg_upgrade failing for 200+ million Large Objects Jan Wieck <[email protected]>
@ 2021-03-23 19:35 ` Tom Lane <[email protected]>
2021-03-23 19:59 ` Re: pg_upgrade failing for 200+ million Large Objects Jan Wieck <[email protected]>
0 siblings, 1 reply; 43+ messages in thread
From: Tom Lane @ 2021-03-23 19:35 UTC (permalink / raw)
To: Jan Wieck <[email protected]>; +Cc: Bruce Momjian <[email protected]>; Zhihong Yu <[email protected]>; Andrew Dunstan <[email protected]>; Magnus Hagander <[email protected]>; Robins Tharakan <[email protected]>; Peter Eisentraut <[email protected]>; pgsql-hackers
Jan Wieck <[email protected]> writes:
> The problem here is that pg_upgrade itself is invoking a shell again. It
> is not assembling an array of arguments to pass into exec*(). I'd be a
> happy camper if it did the latter. But as things are we'd have to add
> full shell escapeing for arbitrary strings.
Surely we need that (and have it already) anyway?
I think we've stayed away from exec* because we'd have to write an
emulation for Windows. Maybe somebody will get fed up and produce
such code, but it's not likely to be the least-effort route to the
goal.
regards, tom lane
^ permalink raw reply [nested|flat] 43+ messages in thread
* Re: pg_upgrade failing for 200+ million Large Objects
2021-03-08 11:02 Re: pg_upgrade failing for 200+ million Large Objects Tharakan, Robins <[email protected]>
2021-03-08 12:33 ` Re: pg_upgrade failing for 200+ million Large Objects Magnus Hagander <[email protected]>
2021-03-08 14:13 ` Re: pg_upgrade failing for 200+ million Large Objects Robins Tharakan <[email protected]>
2021-03-08 16:33 ` Re: pg_upgrade failing for 200+ million Large Objects Tom Lane <[email protected]>
2021-03-08 16:35 ` Re: pg_upgrade failing for 200+ million Large Objects Magnus Hagander <[email protected]>
2021-03-08 16:58 ` Re: pg_upgrade failing for 200+ million Large Objects Tom Lane <[email protected]>
2021-03-20 04:39 ` Re: pg_upgrade failing for 200+ million Large Objects Jan Wieck <[email protected]>
2021-03-20 15:23 ` Re: pg_upgrade failing for 200+ million Large Objects Tom Lane <[email protected]>
2021-03-20 16:55 ` Re: pg_upgrade failing for 200+ million Large Objects Jan Wieck <[email protected]>
2021-03-21 11:47 ` Re: pg_upgrade failing for 200+ million Large Objects Andrew Dunstan <[email protected]>
2021-03-21 16:56 ` Re: pg_upgrade failing for 200+ million Large Objects Jan Wieck <[email protected]>
2021-03-21 18:18 ` Re: pg_upgrade failing for 200+ million Large Objects Andrew Dunstan <[email protected]>
2021-03-22 21:36 ` Re: pg_upgrade failing for 200+ million Large Objects Zhihong Yu <[email protected]>
2021-03-22 23:18 ` Re: pg_upgrade failing for 200+ million Large Objects Jan Wieck <[email protected]>
2021-03-23 12:51 ` Re: pg_upgrade failing for 200+ million Large Objects Jan Wieck <[email protected]>
2021-03-23 14:56 ` Re: pg_upgrade failing for 200+ million Large Objects Bruce Momjian <[email protected]>
2021-03-23 17:25 ` Re: pg_upgrade failing for 200+ million Large Objects Jan Wieck <[email protected]>
2021-03-23 18:06 ` Re: pg_upgrade failing for 200+ million Large Objects Bruce Momjian <[email protected]>
2021-03-23 18:23 ` Re: pg_upgrade failing for 200+ million Large Objects Jan Wieck <[email protected]>
2021-03-23 18:35 ` Re: pg_upgrade failing for 200+ million Large Objects Tom Lane <[email protected]>
2021-03-23 18:54 ` Re: pg_upgrade failing for 200+ million Large Objects Jan Wieck <[email protected]>
2021-03-23 18:59 ` Re: pg_upgrade failing for 200+ million Large Objects Tom Lane <[email protected]>
2021-03-23 19:22 ` Re: pg_upgrade failing for 200+ million Large Objects Jan Wieck <[email protected]>
2021-03-23 19:35 ` Re: pg_upgrade failing for 200+ million Large Objects Tom Lane <[email protected]>
@ 2021-03-23 19:59 ` Jan Wieck <[email protected]>
2021-03-23 20:55 ` Re: pg_upgrade failing for 200+ million Large Objects Tom Lane <[email protected]>
0 siblings, 1 reply; 43+ messages in thread
From: Jan Wieck @ 2021-03-23 19:59 UTC (permalink / raw)
To: Tom Lane <[email protected]>; +Cc: Bruce Momjian <[email protected]>; Zhihong Yu <[email protected]>; Andrew Dunstan <[email protected]>; Magnus Hagander <[email protected]>; Robins Tharakan <[email protected]>; Peter Eisentraut <[email protected]>; pgsql-hackers
On 3/23/21 3:35 PM, Tom Lane wrote:
> Jan Wieck <[email protected]> writes:
>> The problem here is that pg_upgrade itself is invoking a shell again. It
>> is not assembling an array of arguments to pass into exec*(). I'd be a
>> happy camper if it did the latter. But as things are we'd have to add
>> full shell escapeing for arbitrary strings.
>
> Surely we need that (and have it already) anyway?
There are functions to shell escape a single string, like
appendShellString()
but that is hardly enough when a single optarg for --restore-option
could look like any of
--jobs 8
--jobs=8
--jobs='8'
--jobs '8'
--jobs "8"
--jobs="8"
--dont-bother-about-jobs
When placed into a shell string, those things have very different
effects on your args[].
I also want to say that we are overengineering this whole thing. Yes,
there is the problem of shell quoting possibly going wrong as it passes
from one shell to another. But for now this is all about passing a few
numbers down from pg_upgrade to pg_restore (and eventually pg_dump).
Have we even reached a consensus yet on that doing it the way, my patch
is proposing, is the right way to go? Like that emitting BLOB TOC
entries into SECTION_DATA when in binary upgrade mode is a good thing?
Or that bunching all the SQL statements for creating the blob, changing
the ACL and COMMENT and SECLABEL all in one multi-statement-query is.
Maybe we should focus on those details before getting into all the
parameter naming stuff.
Regards, Jan
--
Jan Wieck
Principle Database Engineer
Amazon Web Services
^ permalink raw reply [nested|flat] 43+ messages in thread
* Re: pg_upgrade failing for 200+ million Large Objects
2021-03-08 11:02 Re: pg_upgrade failing for 200+ million Large Objects Tharakan, Robins <[email protected]>
2021-03-08 12:33 ` Re: pg_upgrade failing for 200+ million Large Objects Magnus Hagander <[email protected]>
2021-03-08 14:13 ` Re: pg_upgrade failing for 200+ million Large Objects Robins Tharakan <[email protected]>
2021-03-08 16:33 ` Re: pg_upgrade failing for 200+ million Large Objects Tom Lane <[email protected]>
2021-03-08 16:35 ` Re: pg_upgrade failing for 200+ million Large Objects Magnus Hagander <[email protected]>
2021-03-08 16:58 ` Re: pg_upgrade failing for 200+ million Large Objects Tom Lane <[email protected]>
2021-03-20 04:39 ` Re: pg_upgrade failing for 200+ million Large Objects Jan Wieck <[email protected]>
2021-03-20 15:23 ` Re: pg_upgrade failing for 200+ million Large Objects Tom Lane <[email protected]>
2021-03-20 16:55 ` Re: pg_upgrade failing for 200+ million Large Objects Jan Wieck <[email protected]>
2021-03-21 11:47 ` Re: pg_upgrade failing for 200+ million Large Objects Andrew Dunstan <[email protected]>
2021-03-21 16:56 ` Re: pg_upgrade failing for 200+ million Large Objects Jan Wieck <[email protected]>
2021-03-21 18:18 ` Re: pg_upgrade failing for 200+ million Large Objects Andrew Dunstan <[email protected]>
2021-03-22 21:36 ` Re: pg_upgrade failing for 200+ million Large Objects Zhihong Yu <[email protected]>
2021-03-22 23:18 ` Re: pg_upgrade failing for 200+ million Large Objects Jan Wieck <[email protected]>
2021-03-23 12:51 ` Re: pg_upgrade failing for 200+ million Large Objects Jan Wieck <[email protected]>
2021-03-23 14:56 ` Re: pg_upgrade failing for 200+ million Large Objects Bruce Momjian <[email protected]>
2021-03-23 17:25 ` Re: pg_upgrade failing for 200+ million Large Objects Jan Wieck <[email protected]>
2021-03-23 18:06 ` Re: pg_upgrade failing for 200+ million Large Objects Bruce Momjian <[email protected]>
2021-03-23 18:23 ` Re: pg_upgrade failing for 200+ million Large Objects Jan Wieck <[email protected]>
2021-03-23 18:35 ` Re: pg_upgrade failing for 200+ million Large Objects Tom Lane <[email protected]>
2021-03-23 18:54 ` Re: pg_upgrade failing for 200+ million Large Objects Jan Wieck <[email protected]>
2021-03-23 18:59 ` Re: pg_upgrade failing for 200+ million Large Objects Tom Lane <[email protected]>
2021-03-23 19:22 ` Re: pg_upgrade failing for 200+ million Large Objects Jan Wieck <[email protected]>
2021-03-23 19:35 ` Re: pg_upgrade failing for 200+ million Large Objects Tom Lane <[email protected]>
2021-03-23 19:59 ` Re: pg_upgrade failing for 200+ million Large Objects Jan Wieck <[email protected]>
@ 2021-03-23 20:55 ` Tom Lane <[email protected]>
2021-03-24 16:04 ` Re: pg_upgrade failing for 200+ million Large Objects Jan Wieck <[email protected]>
0 siblings, 1 reply; 43+ messages in thread
From: Tom Lane @ 2021-03-23 20:55 UTC (permalink / raw)
To: Jan Wieck <[email protected]>; +Cc: Bruce Momjian <[email protected]>; Zhihong Yu <[email protected]>; Andrew Dunstan <[email protected]>; Magnus Hagander <[email protected]>; Robins Tharakan <[email protected]>; Peter Eisentraut <[email protected]>; pgsql-hackers
Jan Wieck <[email protected]> writes:
> Have we even reached a consensus yet on that doing it the way, my patch
> is proposing, is the right way to go? Like that emitting BLOB TOC
> entries into SECTION_DATA when in binary upgrade mode is a good thing?
> Or that bunching all the SQL statements for creating the blob, changing
> the ACL and COMMENT and SECLABEL all in one multi-statement-query is.
Now you're asking for actual review effort, which is a little hard
to come by towards the tail end of the last CF of a cycle. I'm
interested in this topic, but I can't justify spending much time
on it right now.
regards, tom lane
^ permalink raw reply [nested|flat] 43+ messages in thread
* Re: pg_upgrade failing for 200+ million Large Objects
2021-03-08 11:02 Re: pg_upgrade failing for 200+ million Large Objects Tharakan, Robins <[email protected]>
2021-03-08 12:33 ` Re: pg_upgrade failing for 200+ million Large Objects Magnus Hagander <[email protected]>
2021-03-08 14:13 ` Re: pg_upgrade failing for 200+ million Large Objects Robins Tharakan <[email protected]>
2021-03-08 16:33 ` Re: pg_upgrade failing for 200+ million Large Objects Tom Lane <[email protected]>
2021-03-08 16:35 ` Re: pg_upgrade failing for 200+ million Large Objects Magnus Hagander <[email protected]>
2021-03-08 16:58 ` Re: pg_upgrade failing for 200+ million Large Objects Tom Lane <[email protected]>
2021-03-20 04:39 ` Re: pg_upgrade failing for 200+ million Large Objects Jan Wieck <[email protected]>
2021-03-20 15:23 ` Re: pg_upgrade failing for 200+ million Large Objects Tom Lane <[email protected]>
2021-03-20 16:55 ` Re: pg_upgrade failing for 200+ million Large Objects Jan Wieck <[email protected]>
2021-03-21 11:47 ` Re: pg_upgrade failing for 200+ million Large Objects Andrew Dunstan <[email protected]>
2021-03-21 16:56 ` Re: pg_upgrade failing for 200+ million Large Objects Jan Wieck <[email protected]>
2021-03-21 18:18 ` Re: pg_upgrade failing for 200+ million Large Objects Andrew Dunstan <[email protected]>
2021-03-22 21:36 ` Re: pg_upgrade failing for 200+ million Large Objects Zhihong Yu <[email protected]>
2021-03-22 23:18 ` Re: pg_upgrade failing for 200+ million Large Objects Jan Wieck <[email protected]>
2021-03-23 12:51 ` Re: pg_upgrade failing for 200+ million Large Objects Jan Wieck <[email protected]>
2021-03-23 14:56 ` Re: pg_upgrade failing for 200+ million Large Objects Bruce Momjian <[email protected]>
2021-03-23 17:25 ` Re: pg_upgrade failing for 200+ million Large Objects Jan Wieck <[email protected]>
2021-03-23 18:06 ` Re: pg_upgrade failing for 200+ million Large Objects Bruce Momjian <[email protected]>
2021-03-23 18:23 ` Re: pg_upgrade failing for 200+ million Large Objects Jan Wieck <[email protected]>
2021-03-23 18:35 ` Re: pg_upgrade failing for 200+ million Large Objects Tom Lane <[email protected]>
2021-03-23 18:54 ` Re: pg_upgrade failing for 200+ million Large Objects Jan Wieck <[email protected]>
2021-03-23 18:59 ` Re: pg_upgrade failing for 200+ million Large Objects Tom Lane <[email protected]>
2021-03-23 19:22 ` Re: pg_upgrade failing for 200+ million Large Objects Jan Wieck <[email protected]>
2021-03-23 19:35 ` Re: pg_upgrade failing for 200+ million Large Objects Tom Lane <[email protected]>
2021-03-23 19:59 ` Re: pg_upgrade failing for 200+ million Large Objects Jan Wieck <[email protected]>
2021-03-23 20:55 ` Re: pg_upgrade failing for 200+ million Large Objects Tom Lane <[email protected]>
@ 2021-03-24 16:04 ` Jan Wieck <[email protected]>
2021-03-24 16:05 ` Re: pg_upgrade failing for 200+ million Large Objects Jan Wieck <[email protected]>
0 siblings, 1 reply; 43+ messages in thread
From: Jan Wieck @ 2021-03-24 16:04 UTC (permalink / raw)
To: Tom Lane <[email protected]>; +Cc: Bruce Momjian <[email protected]>; Zhihong Yu <[email protected]>; Andrew Dunstan <[email protected]>; Magnus Hagander <[email protected]>; Robins Tharakan <[email protected]>; Peter Eisentraut <[email protected]>; pgsql-hackers
On 3/23/21 4:55 PM, Tom Lane wrote:
> Jan Wieck <[email protected]> writes:
>> Have we even reached a consensus yet on that doing it the way, my patch
>> is proposing, is the right way to go? Like that emitting BLOB TOC
>> entries into SECTION_DATA when in binary upgrade mode is a good thing?
>> Or that bunching all the SQL statements for creating the blob, changing
>> the ACL and COMMENT and SECLABEL all in one multi-statement-query is.
>
> Now you're asking for actual review effort, which is a little hard
> to come by towards the tail end of the last CF of a cycle. I'm
> interested in this topic, but I can't justify spending much time
> on it right now.
Understood.
In any case I changed the options so that they behave the same way, the
existing -o and -O (for old/new postmaster options) work. I don't think
it would be wise to have option forwarding work differently between
options for postmaster and options for pg_dump/pg_restore.
Regards, Jan
--
Jan Wieck
Principle Database Engineer
Amazon Web Services
^ permalink raw reply [nested|flat] 43+ messages in thread
* Re: pg_upgrade failing for 200+ million Large Objects
2021-03-08 11:02 Re: pg_upgrade failing for 200+ million Large Objects Tharakan, Robins <[email protected]>
2021-03-08 12:33 ` Re: pg_upgrade failing for 200+ million Large Objects Magnus Hagander <[email protected]>
2021-03-08 14:13 ` Re: pg_upgrade failing for 200+ million Large Objects Robins Tharakan <[email protected]>
2021-03-08 16:33 ` Re: pg_upgrade failing for 200+ million Large Objects Tom Lane <[email protected]>
2021-03-08 16:35 ` Re: pg_upgrade failing for 200+ million Large Objects Magnus Hagander <[email protected]>
2021-03-08 16:58 ` Re: pg_upgrade failing for 200+ million Large Objects Tom Lane <[email protected]>
2021-03-20 04:39 ` Re: pg_upgrade failing for 200+ million Large Objects Jan Wieck <[email protected]>
2021-03-20 15:23 ` Re: pg_upgrade failing for 200+ million Large Objects Tom Lane <[email protected]>
2021-03-20 16:55 ` Re: pg_upgrade failing for 200+ million Large Objects Jan Wieck <[email protected]>
2021-03-21 11:47 ` Re: pg_upgrade failing for 200+ million Large Objects Andrew Dunstan <[email protected]>
2021-03-21 16:56 ` Re: pg_upgrade failing for 200+ million Large Objects Jan Wieck <[email protected]>
2021-03-21 18:18 ` Re: pg_upgrade failing for 200+ million Large Objects Andrew Dunstan <[email protected]>
2021-03-22 21:36 ` Re: pg_upgrade failing for 200+ million Large Objects Zhihong Yu <[email protected]>
2021-03-22 23:18 ` Re: pg_upgrade failing for 200+ million Large Objects Jan Wieck <[email protected]>
2021-03-23 12:51 ` Re: pg_upgrade failing for 200+ million Large Objects Jan Wieck <[email protected]>
2021-03-23 14:56 ` Re: pg_upgrade failing for 200+ million Large Objects Bruce Momjian <[email protected]>
2021-03-23 17:25 ` Re: pg_upgrade failing for 200+ million Large Objects Jan Wieck <[email protected]>
2021-03-23 18:06 ` Re: pg_upgrade failing for 200+ million Large Objects Bruce Momjian <[email protected]>
2021-03-23 18:23 ` Re: pg_upgrade failing for 200+ million Large Objects Jan Wieck <[email protected]>
2021-03-23 18:35 ` Re: pg_upgrade failing for 200+ million Large Objects Tom Lane <[email protected]>
2021-03-23 18:54 ` Re: pg_upgrade failing for 200+ million Large Objects Jan Wieck <[email protected]>
2021-03-23 18:59 ` Re: pg_upgrade failing for 200+ million Large Objects Tom Lane <[email protected]>
2021-03-23 19:22 ` Re: pg_upgrade failing for 200+ million Large Objects Jan Wieck <[email protected]>
2021-03-23 19:35 ` Re: pg_upgrade failing for 200+ million Large Objects Tom Lane <[email protected]>
2021-03-23 19:59 ` Re: pg_upgrade failing for 200+ million Large Objects Jan Wieck <[email protected]>
2021-03-23 20:55 ` Re: pg_upgrade failing for 200+ million Large Objects Tom Lane <[email protected]>
2021-03-24 16:04 ` Re: pg_upgrade failing for 200+ million Large Objects Jan Wieck <[email protected]>
@ 2021-03-24 16:05 ` Jan Wieck <[email protected]>
0 siblings, 0 replies; 43+ messages in thread
From: Jan Wieck @ 2021-03-24 16:05 UTC (permalink / raw)
To: Tom Lane <[email protected]>; +Cc: Bruce Momjian <[email protected]>; Zhihong Yu <[email protected]>; Andrew Dunstan <[email protected]>; Magnus Hagander <[email protected]>; Robins Tharakan <[email protected]>; Peter Eisentraut <[email protected]>; pgsql-hackers
On 3/24/21 12:04 PM, Jan Wieck wrote:
> In any case I changed the options so that they behave the same way, the
> existing -o and -O (for old/new postmaster options) work. I don't think
> it would be wise to have option forwarding work differently between
> options for postmaster and options for pg_dump/pg_restore.
Attaching the actual diff might help.
--
Jan Wieck
Principle Database Engineer
Amazon Web Services
Attachments:
[text/x-patch] pg_upgrade_improvements.v4.diff (24.3K, 2-pg_upgrade_improvements.v4.diff)
download | inline diff:
diff --git a/src/bin/pg_dump/parallel.c b/src/bin/pg_dump/parallel.c
index c7351a4..4a611d0 100644
--- a/src/bin/pg_dump/parallel.c
+++ b/src/bin/pg_dump/parallel.c
@@ -864,6 +864,11 @@ RunWorker(ArchiveHandle *AH, ParallelSlot *slot)
WaitForCommands(AH, pipefd);
/*
+ * Close an eventually open BLOB batch transaction.
+ */
+ CommitBlobTransaction((Archive *)AH);
+
+ /*
* Disconnect from database and clean up.
*/
set_cancel_slot_archive(slot, NULL);
diff --git a/src/bin/pg_dump/pg_backup.h b/src/bin/pg_dump/pg_backup.h
index 0296b9b..cd8a590 100644
--- a/src/bin/pg_dump/pg_backup.h
+++ b/src/bin/pg_dump/pg_backup.h
@@ -203,6 +203,8 @@ typedef struct Archive
int numWorkers; /* number of parallel processes */
char *sync_snapshot_id; /* sync snapshot id for parallel operation */
+ int blobBatchSize; /* # of blobs to restore per transaction */
+
/* info needed for string escaping */
int encoding; /* libpq code for client_encoding */
bool std_strings; /* standard_conforming_strings */
@@ -269,6 +271,7 @@ extern void WriteData(Archive *AH, const void *data, size_t dLen);
extern int StartBlob(Archive *AH, Oid oid);
extern int EndBlob(Archive *AH, Oid oid);
+extern void CommitBlobTransaction(Archive *AH);
extern void CloseArchive(Archive *AH);
extern void SetArchiveOptions(Archive *AH, DumpOptions *dopt, RestoreOptions *ropt);
diff --git a/src/bin/pg_dump/pg_backup_archiver.c b/src/bin/pg_dump/pg_backup_archiver.c
index 1f82c64..8331e8a 100644
--- a/src/bin/pg_dump/pg_backup_archiver.c
+++ b/src/bin/pg_dump/pg_backup_archiver.c
@@ -68,6 +68,7 @@ typedef struct _parallelReadyList
bool sorted; /* are valid entries currently sorted? */
} ParallelReadyList;
+static int blobBatchCount = 0;
static ArchiveHandle *_allocAH(const char *FileSpec, const ArchiveFormat fmt,
const int compression, bool dosync, ArchiveMode mode,
@@ -265,6 +266,8 @@ CloseArchive(Archive *AHX)
int res = 0;
ArchiveHandle *AH = (ArchiveHandle *) AHX;
+ CommitBlobTransaction(AHX);
+
AH->ClosePtr(AH);
/* Close the output */
@@ -279,6 +282,23 @@ CloseArchive(Archive *AHX)
/* Public */
void
+CommitBlobTransaction(Archive *AHX)
+{
+ ArchiveHandle *AH = (ArchiveHandle *) AHX;
+
+ if (blobBatchCount > 0)
+ {
+ ahprintf(AH, "--\n");
+ ahprintf(AH, "-- End BLOB restore batch\n");
+ ahprintf(AH, "--\n");
+ ahprintf(AH, "COMMIT;\n\n");
+
+ blobBatchCount = 0;
+ }
+}
+
+/* Public */
+void
SetArchiveOptions(Archive *AH, DumpOptions *dopt, RestoreOptions *ropt)
{
/* Caller can omit dump options, in which case we synthesize them */
@@ -3531,6 +3551,57 @@ _printTocEntry(ArchiveHandle *AH, TocEntry *te, bool isData)
{
RestoreOptions *ropt = AH->public.ropt;
+ /* We restore BLOBs in batches to reduce XID consumption */
+ if (strcmp(te->desc, "BLOB") == 0 && AH->public.blobBatchSize > 0)
+ {
+ if (blobBatchCount > 0)
+ {
+ /* We are inside a BLOB restore transaction */
+ if (blobBatchCount >= AH->public.blobBatchSize)
+ {
+ /*
+ * We did reach the batch size with the previous BLOB.
+ * Commit and start a new batch.
+ */
+ ahprintf(AH, "--\n");
+ ahprintf(AH, "-- BLOB batch size reached\n");
+ ahprintf(AH, "--\n");
+ ahprintf(AH, "COMMIT;\n");
+ ahprintf(AH, "BEGIN;\n\n");
+
+ blobBatchCount = 1;
+ }
+ else
+ {
+ /* This one still fits into the current batch */
+ blobBatchCount++;
+ }
+ }
+ else
+ {
+ /* Not inside a transaction, start a new batch */
+ ahprintf(AH, "--\n");
+ ahprintf(AH, "-- Start BLOB restore batch\n");
+ ahprintf(AH, "--\n");
+ ahprintf(AH, "BEGIN;\n\n");
+
+ blobBatchCount = 1;
+ }
+ }
+ else
+ {
+ /* Not a BLOB. If we have a BLOB batch open, close it. */
+ if (blobBatchCount > 0)
+ {
+ ahprintf(AH, "--\n");
+ ahprintf(AH, "-- End BLOB restore batch\n");
+ ahprintf(AH, "--\n");
+ ahprintf(AH, "COMMIT;\n\n");
+
+ blobBatchCount = 0;
+ }
+ }
+
/* Select owner, schema, tablespace and default AM as necessary */
_becomeOwner(AH, te);
_selectOutputSchema(AH, te->namespace);
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index f8bec3f..f153f08 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -165,12 +165,20 @@ static void guessConstraintInheritance(TableInfo *tblinfo, int numTables);
static void dumpComment(Archive *fout, const char *type, const char *name,
const char *namespace, const char *owner,
CatalogId catalogId, int subid, DumpId dumpId);
+static bool dumpCommentQuery(Archive *fout, PQExpBuffer query, PQExpBuffer tag,
+ const char *type, const char *name,
+ const char *namespace, const char *owner,
+ CatalogId catalogId, int subid, DumpId dumpId);
static int findComments(Archive *fout, Oid classoid, Oid objoid,
CommentItem **items);
static int collectComments(Archive *fout, CommentItem **items);
static void dumpSecLabel(Archive *fout, const char *type, const char *name,
const char *namespace, const char *owner,
CatalogId catalogId, int subid, DumpId dumpId);
+static bool dumpSecLabelQuery(Archive *fout, PQExpBuffer query, PQExpBuffer tag,
+ const char *type, const char *name,
+ const char *namespace, const char *owner,
+ CatalogId catalogId, int subid, DumpId dumpId);
static int findSecLabels(Archive *fout, Oid classoid, Oid objoid,
SecLabelItem **items);
static int collectSecLabels(Archive *fout, SecLabelItem **items);
@@ -227,6 +235,13 @@ static DumpId dumpACL(Archive *fout, DumpId objDumpId, DumpId altDumpId,
const char *nspname, const char *owner,
const char *acls, const char *racls,
const char *initacls, const char *initracls);
+static bool dumpACLQuery(Archive *fout, PQExpBuffer query, PQExpBuffer tag,
+ DumpId objDumpId, DumpId altDumpId,
+ const char *type, const char *name,
+ const char *subname,
+ const char *nspname, const char *owner,
+ const char *acls, const char *racls,
+ const char *initacls, const char *initracls);
static void getDependencies(Archive *fout);
static void BuildArchiveDependencies(Archive *fout);
@@ -3468,11 +3483,44 @@ dumpBlob(Archive *fout, const BlobInfo *binfo)
{
PQExpBuffer cquery = createPQExpBuffer();
PQExpBuffer dquery = createPQExpBuffer();
+ PQExpBuffer tag = createPQExpBuffer();
+ teSection section = SECTION_PRE_DATA;
appendPQExpBuffer(cquery,
"SELECT pg_catalog.lo_create('%s');\n",
binfo->dobj.name);
+ /*
+ * In binary upgrade mode we put all the queries to restore
+ * one large object into a single TOC entry and emit it as
+ * SECTION_DATA so that they can be restored in parallel.
+ */
+ if (fout->dopt->binary_upgrade)
+ {
+ section = SECTION_DATA;
+
+ /* Dump comment if any */
+ if (binfo->dobj.dump & DUMP_COMPONENT_COMMENT)
+ dumpCommentQuery(fout, cquery, tag, "LARGE OBJECT",
+ binfo->dobj.name, NULL, binfo->rolname,
+ binfo->dobj.catId, 0, binfo->dobj.dumpId);
+
+ /* Dump security label if any */
+ if (binfo->dobj.dump & DUMP_COMPONENT_SECLABEL)
+ dumpSecLabelQuery(fout, cquery, tag, "LARGE OBJECT",
+ binfo->dobj.name,
+ NULL, binfo->rolname,
+ binfo->dobj.catId, 0, binfo->dobj.dumpId);
+
+ /* Dump ACL if any */
+ if (binfo->blobacl && (binfo->dobj.dump & DUMP_COMPONENT_ACL))
+ dumpACLQuery(fout, cquery, tag,
+ binfo->dobj.dumpId, InvalidDumpId, "LARGE OBJECT",
+ binfo->dobj.name, NULL,
+ NULL, binfo->rolname, binfo->blobacl, binfo->rblobacl,
+ binfo->initblobacl, binfo->initrblobacl);
+ }
+
appendPQExpBuffer(dquery,
"SELECT pg_catalog.lo_unlink('%s');\n",
binfo->dobj.name);
@@ -3482,28 +3530,31 @@ dumpBlob(Archive *fout, const BlobInfo *binfo)
ARCHIVE_OPTS(.tag = binfo->dobj.name,
.owner = binfo->rolname,
.description = "BLOB",
- .section = SECTION_PRE_DATA,
+ .section = section,
.createStmt = cquery->data,
.dropStmt = dquery->data));
- /* Dump comment if any */
- if (binfo->dobj.dump & DUMP_COMPONENT_COMMENT)
- dumpComment(fout, "LARGE OBJECT", binfo->dobj.name,
- NULL, binfo->rolname,
- binfo->dobj.catId, 0, binfo->dobj.dumpId);
-
- /* Dump security label if any */
- if (binfo->dobj.dump & DUMP_COMPONENT_SECLABEL)
- dumpSecLabel(fout, "LARGE OBJECT", binfo->dobj.name,
- NULL, binfo->rolname,
- binfo->dobj.catId, 0, binfo->dobj.dumpId);
-
- /* Dump ACL if any */
- if (binfo->blobacl && (binfo->dobj.dump & DUMP_COMPONENT_ACL))
- dumpACL(fout, binfo->dobj.dumpId, InvalidDumpId, "LARGE OBJECT",
- binfo->dobj.name, NULL,
- NULL, binfo->rolname, binfo->blobacl, binfo->rblobacl,
- binfo->initblobacl, binfo->initrblobacl);
+ if (!fout->dopt->binary_upgrade)
+ {
+ /* Dump comment if any */
+ if (binfo->dobj.dump & DUMP_COMPONENT_COMMENT)
+ dumpComment(fout, "LARGE OBJECT", binfo->dobj.name,
+ NULL, binfo->rolname,
+ binfo->dobj.catId, 0, binfo->dobj.dumpId);
+
+ /* Dump security label if any */
+ if (binfo->dobj.dump & DUMP_COMPONENT_SECLABEL)
+ dumpSecLabel(fout, "LARGE OBJECT", binfo->dobj.name,
+ NULL, binfo->rolname,
+ binfo->dobj.catId, 0, binfo->dobj.dumpId);
+
+ /* Dump ACL if any */
+ if (binfo->blobacl && (binfo->dobj.dump & DUMP_COMPONENT_ACL))
+ dumpACL(fout, binfo->dobj.dumpId, InvalidDumpId, "LARGE OBJECT",
+ binfo->dobj.name, NULL,
+ NULL, binfo->rolname, binfo->blobacl, binfo->rblobacl,
+ binfo->initblobacl, binfo->initrblobacl);
+ }
destroyPQExpBuffer(cquery);
destroyPQExpBuffer(dquery);
@@ -9868,25 +9919,56 @@ dumpComment(Archive *fout, const char *type, const char *name,
const char *namespace, const char *owner,
CatalogId catalogId, int subid, DumpId dumpId)
{
+ PQExpBuffer query = createPQExpBuffer();
+ PQExpBuffer tag = createPQExpBuffer();
+
+ if (dumpCommentQuery(fout, query, tag, type, name, namespace, owner,
+ catalogId, subid, dumpId))
+ {
+ /*
+ * We mark comments as SECTION_NONE because they really belong in the
+ * same section as their parent, whether that is pre-data or
+ * post-data.
+ */
+ ArchiveEntry(fout, nilCatalogId, createDumpId(),
+ ARCHIVE_OPTS(.tag = tag->data,
+ .namespace = namespace,
+ .owner = owner,
+ .description = "COMMENT",
+ .section = SECTION_NONE,
+ .createStmt = query->data,
+ .deps = &dumpId,
+ .nDeps = 1));
+ }
+ destroyPQExpBuffer(query);
+ destroyPQExpBuffer(tag);
+}
+
+static bool
+dumpCommentQuery(Archive *fout, PQExpBuffer query, PQExpBuffer tag,
+ const char *type, const char *name,
+ const char *namespace, const char *owner,
+ CatalogId catalogId, int subid, DumpId dumpId)
+{
DumpOptions *dopt = fout->dopt;
CommentItem *comments;
int ncomments;
/* do nothing, if --no-comments is supplied */
if (dopt->no_comments)
- return;
+ return false;
/* Comments are schema not data ... except blob comments are data */
if (strcmp(type, "LARGE OBJECT") != 0)
{
if (dopt->dataOnly)
- return;
+ return false;
}
else
{
/* We do dump blob comments in binary-upgrade mode */
if (dopt->schemaOnly && !dopt->binary_upgrade)
- return;
+ return false;
}
/* Search for comments associated with catalogId, using table */
@@ -9905,9 +9987,6 @@ dumpComment(Archive *fout, const char *type, const char *name,
/* If a comment exists, build COMMENT ON statement */
if (ncomments > 0)
{
- PQExpBuffer query = createPQExpBuffer();
- PQExpBuffer tag = createPQExpBuffer();
-
appendPQExpBuffer(query, "COMMENT ON %s ", type);
if (namespace && *namespace)
appendPQExpBuffer(query, "%s.", fmtId(namespace));
@@ -9917,24 +9996,10 @@ dumpComment(Archive *fout, const char *type, const char *name,
appendPQExpBuffer(tag, "%s %s", type, name);
- /*
- * We mark comments as SECTION_NONE because they really belong in the
- * same section as their parent, whether that is pre-data or
- * post-data.
- */
- ArchiveEntry(fout, nilCatalogId, createDumpId(),
- ARCHIVE_OPTS(.tag = tag->data,
- .namespace = namespace,
- .owner = owner,
- .description = "COMMENT",
- .section = SECTION_NONE,
- .createStmt = query->data,
- .deps = &dumpId,
- .nDeps = 1));
-
- destroyPQExpBuffer(query);
- destroyPQExpBuffer(tag);
+ return true;
}
+
+ return false;
}
/*
@@ -15070,18 +15135,63 @@ dumpACL(Archive *fout, DumpId objDumpId, DumpId altDumpId,
const char *initacls, const char *initracls)
{
DumpId aclDumpId = InvalidDumpId;
+ PQExpBuffer query = createPQExpBuffer();
+ PQExpBuffer tag = createPQExpBuffer();
+
+ if (dumpACLQuery(fout, query, tag, objDumpId, altDumpId,
+ type, name, subname, nspname, owner,
+ acls, racls, initacls, initracls))
+ {
+ DumpId aclDeps[2];
+ int nDeps = 0;
+
+ if (subname)
+ appendPQExpBuffer(tag, "COLUMN %s.%s", name, subname);
+ else
+ appendPQExpBuffer(tag, "%s %s", type, name);
+
+ aclDeps[nDeps++] = objDumpId;
+ if (altDumpId != InvalidDumpId)
+ aclDeps[nDeps++] = altDumpId;
+
+ aclDumpId = createDumpId();
+
+ ArchiveEntry(fout, nilCatalogId, aclDumpId,
+ ARCHIVE_OPTS(.tag = tag->data,
+ .namespace = nspname,
+ .owner = owner,
+ .description = "ACL",
+ .section = SECTION_NONE,
+ .createStmt = query->data,
+ .deps = aclDeps,
+ .nDeps = nDeps));
+
+ }
+
+ destroyPQExpBuffer(query);
+ destroyPQExpBuffer(tag);
+
+ return aclDumpId;
+}
+
+static bool
+dumpACLQuery(Archive *fout, PQExpBuffer query, PQExpBuffer tag,
+ DumpId objDumpId, DumpId altDumpId,
+ const char *type, const char *name, const char *subname,
+ const char *nspname, const char *owner,
+ const char *acls, const char *racls,
+ const char *initacls, const char *initracls)
+{
DumpOptions *dopt = fout->dopt;
- PQExpBuffer sql;
+ bool haveACL = false;
/* Do nothing if ACL dump is not enabled */
if (dopt->aclsSkip)
- return InvalidDumpId;
+ return false;
/* --data-only skips ACLs *except* BLOB ACLs */
if (dopt->dataOnly && strcmp(type, "LARGE OBJECT") != 0)
- return InvalidDumpId;
-
- sql = createPQExpBuffer();
+ return false;
/*
* Check to see if this object has had any initial ACLs included for it.
@@ -15093,54 +15203,31 @@ dumpACL(Archive *fout, DumpId objDumpId, DumpId altDumpId,
*/
if (strlen(initacls) != 0 || strlen(initracls) != 0)
{
- appendPQExpBufferStr(sql, "SELECT pg_catalog.binary_upgrade_set_record_init_privs(true);\n");
+ haveACL = true;
+ appendPQExpBufferStr(query, "SELECT pg_catalog.binary_upgrade_set_record_init_privs(true);\n");
if (!buildACLCommands(name, subname, nspname, type,
initacls, initracls, owner,
- "", fout->remoteVersion, sql))
+ "", fout->remoteVersion, query))
fatal("could not parse initial GRANT ACL list (%s) or initial REVOKE ACL list (%s) for object \"%s\" (%s)",
initacls, initracls, name, type);
- appendPQExpBufferStr(sql, "SELECT pg_catalog.binary_upgrade_set_record_init_privs(false);\n");
+ appendPQExpBufferStr(query, "SELECT pg_catalog.binary_upgrade_set_record_init_privs(false);\n");
}
if (!buildACLCommands(name, subname, nspname, type,
acls, racls, owner,
- "", fout->remoteVersion, sql))
+ "", fout->remoteVersion, query))
fatal("could not parse GRANT ACL list (%s) or REVOKE ACL list (%s) for object \"%s\" (%s)",
acls, racls, name, type);
- if (sql->len > 0)
+ if (haveACL && tag != NULL)
{
- PQExpBuffer tag = createPQExpBuffer();
- DumpId aclDeps[2];
- int nDeps = 0;
-
if (subname)
appendPQExpBuffer(tag, "COLUMN %s.%s", name, subname);
else
appendPQExpBuffer(tag, "%s %s", type, name);
-
- aclDeps[nDeps++] = objDumpId;
- if (altDumpId != InvalidDumpId)
- aclDeps[nDeps++] = altDumpId;
-
- aclDumpId = createDumpId();
-
- ArchiveEntry(fout, nilCatalogId, aclDumpId,
- ARCHIVE_OPTS(.tag = tag->data,
- .namespace = nspname,
- .owner = owner,
- .description = "ACL",
- .section = SECTION_NONE,
- .createStmt = sql->data,
- .deps = aclDeps,
- .nDeps = nDeps));
-
- destroyPQExpBuffer(tag);
}
- destroyPQExpBuffer(sql);
-
- return aclDumpId;
+ return haveACL;
}
/*
@@ -15166,34 +15253,58 @@ dumpSecLabel(Archive *fout, const char *type, const char *name,
const char *namespace, const char *owner,
CatalogId catalogId, int subid, DumpId dumpId)
{
+ PQExpBuffer query = createPQExpBuffer();
+ PQExpBuffer tag = createPQExpBuffer();
+
+ if (dumpSecLabelQuery(fout, query, tag, type, name,
+ namespace, owner, catalogId, subid, dumpId))
+ {
+ ArchiveEntry(fout, nilCatalogId, createDumpId(),
+ ARCHIVE_OPTS(.tag = tag->data,
+ .namespace = namespace,
+ .owner = owner,
+ .description = "SECURITY LABEL",
+ .section = SECTION_NONE,
+ .createStmt = query->data,
+ .deps = &dumpId,
+ .nDeps = 1));
+ }
+
+ destroyPQExpBuffer(query);
+ destroyPQExpBuffer(tag);
+}
+
+static bool
+dumpSecLabelQuery(Archive *fout, PQExpBuffer query, PQExpBuffer tag,
+ const char *type, const char *name,
+ const char *namespace, const char *owner,
+ CatalogId catalogId, int subid, DumpId dumpId)
+{
DumpOptions *dopt = fout->dopt;
SecLabelItem *labels;
int nlabels;
int i;
- PQExpBuffer query;
/* do nothing, if --no-security-labels is supplied */
if (dopt->no_security_labels)
- return;
+ return false;
/* Security labels are schema not data ... except blob labels are data */
if (strcmp(type, "LARGE OBJECT") != 0)
{
if (dopt->dataOnly)
- return;
+ return false;
}
else
{
/* We do dump blob security labels in binary-upgrade mode */
if (dopt->schemaOnly && !dopt->binary_upgrade)
- return;
+ return false;
}
/* Search for security labels associated with catalogId, using table */
nlabels = findSecLabels(fout, catalogId.tableoid, catalogId.oid, &labels);
- query = createPQExpBuffer();
-
for (i = 0; i < nlabels; i++)
{
/*
@@ -15214,22 +15325,11 @@ dumpSecLabel(Archive *fout, const char *type, const char *name,
if (query->len > 0)
{
- PQExpBuffer tag = createPQExpBuffer();
-
appendPQExpBuffer(tag, "%s %s", type, name);
- ArchiveEntry(fout, nilCatalogId, createDumpId(),
- ARCHIVE_OPTS(.tag = tag->data,
- .namespace = namespace,
- .owner = owner,
- .description = "SECURITY LABEL",
- .section = SECTION_NONE,
- .createStmt = query->data,
- .deps = &dumpId,
- .nDeps = 1));
- destroyPQExpBuffer(tag);
+ return true;
}
- destroyPQExpBuffer(query);
+ return false;
}
/*
diff --git a/src/bin/pg_dump/pg_restore.c b/src/bin/pg_dump/pg_restore.c
index 589b4ae..b16db03 100644
--- a/src/bin/pg_dump/pg_restore.c
+++ b/src/bin/pg_dump/pg_restore.c
@@ -59,6 +59,7 @@ main(int argc, char **argv)
int c;
int exit_code;
int numWorkers = 1;
+ int blobBatchSize = 0;
Archive *AH;
char *inputFileSpec;
static int disable_triggers = 0;
@@ -120,6 +121,7 @@ main(int argc, char **argv)
{"no-publications", no_argument, &no_publications, 1},
{"no-security-labels", no_argument, &no_security_labels, 1},
{"no-subscriptions", no_argument, &no_subscriptions, 1},
+ {"restore-blob-batch-size", required_argument, NULL, 4},
{NULL, 0, NULL, 0}
};
@@ -280,6 +282,10 @@ main(int argc, char **argv)
set_dump_section(optarg, &(opts->dumpSections));
break;
+ case 4: /* # of blobs to restore per transaction */
+ blobBatchSize = atoi(optarg);
+ break;
+
default:
fprintf(stderr, _("Try \"%s --help\" for more information.\n"), progname);
exit_nicely(1);
@@ -434,6 +440,7 @@ main(int argc, char **argv)
SortTocFromFile(AH);
AH->numWorkers = numWorkers;
+ AH->blobBatchSize = blobBatchSize;
if (opts->tocSummary)
PrintTOCSummary(AH);
@@ -506,6 +513,8 @@ usage(const char *progname)
printf(_(" --use-set-session-authorization\n"
" use SET SESSION AUTHORIZATION commands instead of\n"
" ALTER OWNER commands to set ownership\n"));
+ printf(_(" --restore-blob-batch-size=NUM\n"
+ " attempt to restore NUM large objects per transaction\n"));
printf(_("\nConnection options:\n"));
printf(_(" -h, --host=HOSTNAME database server host or socket directory\n"));
diff --git a/src/bin/pg_upgrade/dump.c b/src/bin/pg_upgrade/dump.c
index 33d9591..183bb6d 100644
--- a/src/bin/pg_upgrade/dump.c
+++ b/src/bin/pg_upgrade/dump.c
@@ -52,8 +52,11 @@ generate_old_dump(void)
parallel_exec_prog(log_file_name, NULL,
"\"%s/pg_dump\" %s --schema-only --quote-all-identifiers "
+ "%s "
"--binary-upgrade --format=custom %s %s --file=\"%s\" %s",
new_cluster.bindir, cluster_conn_opts(&old_cluster),
+ user_opts.pg_dump_opts ?
+ user_opts.pg_dump_opts : "",
log_opts.verbose ? "--verbose" : "",
user_opts.ind_coll_unknown ?
"--index-collation-versions-unknown" : "",
diff --git a/src/bin/pg_upgrade/option.c b/src/bin/pg_upgrade/option.c
index 9c9b313..d0efb9f 100644
--- a/src/bin/pg_upgrade/option.c
+++ b/src/bin/pg_upgrade/option.c
@@ -57,6 +57,8 @@ parseCommandLine(int argc, char *argv[])
{"verbose", no_argument, NULL, 'v'},
{"clone", no_argument, NULL, 1},
{"index-collation-versions-unknown", no_argument, NULL, 2},
+ {"dump-options", required_argument, NULL, 3},
+ {"restore-options", required_argument, NULL, 4},
{NULL, 0, NULL, 0}
};
@@ -208,6 +210,34 @@ parseCommandLine(int argc, char *argv[])
user_opts.ind_coll_unknown = true;
break;
+ case 3:
+ /* append option? */
+ if (!user_opts.pg_dump_opts)
+ user_opts.pg_dump_opts = pg_strdup(optarg);
+ else
+ {
+ char *old_opts = user_opts.pg_dump_opts;
+
+ user_opts.pg_dump_opts = psprintf("%s %s",
+ old_opts, optarg);
+ free(old_opts);
+ }
+ break;
+
+ case 4:
+ /* append option? */
+ if (!user_opts.pg_restore_opts)
+ user_opts.pg_restore_opts = pg_strdup(optarg);
+ else
+ {
+ char *old_opts = user_opts.pg_restore_opts;
+
+ user_opts.pg_restore_opts = psprintf("%s %s",
+ old_opts, optarg);
+ free(old_opts);
+ }
+ break;
+
default:
fprintf(stderr, _("Try \"%s --help\" for more information.\n"),
os_info.progname);
@@ -314,6 +344,8 @@ usage(void)
printf(_(" --clone clone instead of copying files to new cluster\n"));
printf(_(" --index-collation-versions-unknown\n"));
printf(_(" mark text indexes as needing to be rebuilt\n"));
+ printf(_(" --dump-options=OPTIONS options to pass to pg_dump\n"));
+ printf(_(" --restore-options=OPTIONS options to pass to pg_restore\n"));
printf(_(" -?, --help show this help, then exit\n"));
printf(_("\n"
"Before running pg_upgrade you must:\n"
diff --git a/src/bin/pg_upgrade/pg_upgrade.c b/src/bin/pg_upgrade/pg_upgrade.c
index e23b8ca..6f6b12d 100644
--- a/src/bin/pg_upgrade/pg_upgrade.c
+++ b/src/bin/pg_upgrade/pg_upgrade.c
@@ -348,10 +348,13 @@ create_new_objects(void)
true,
true,
"\"%s/pg_restore\" %s %s --exit-on-error --verbose "
+ "%s "
"--dbname postgres \"%s\"",
new_cluster.bindir,
cluster_conn_opts(&new_cluster),
create_opts,
+ user_opts.pg_restore_opts ?
+ user_opts.pg_restore_opts : "",
sql_file_name);
break; /* done once we've processed template1 */
@@ -385,10 +388,13 @@ create_new_objects(void)
parallel_exec_prog(log_file_name,
NULL,
"\"%s/pg_restore\" %s %s --exit-on-error --verbose "
+ "%s "
"--dbname template1 \"%s\"",
new_cluster.bindir,
cluster_conn_opts(&new_cluster),
create_opts,
+ user_opts.pg_restore_opts ?
+ user_opts.pg_restore_opts : "",
sql_file_name);
}
diff --git a/src/bin/pg_upgrade/pg_upgrade.h b/src/bin/pg_upgrade/pg_upgrade.h
index 919a784..4b7959e 100644
--- a/src/bin/pg_upgrade/pg_upgrade.h
+++ b/src/bin/pg_upgrade/pg_upgrade.h
@@ -293,6 +293,8 @@ typedef struct
int jobs; /* number of processes/threads to use */
char *socketdir; /* directory to use for Unix sockets */
bool ind_coll_unknown; /* mark unknown index collation versions */
+ char *pg_dump_opts; /* options to pass to pg_dump */
+ char *pg_restore_opts; /* options to pass to pg_dump */
} UserOpts;
typedef struct
^ permalink raw reply [nested|flat] 43+ messages in thread
* Fix pg_upgrade to preserve datdba (was: Re: pg_upgrade failing for 200+ million Large Objects)
2021-03-08 11:02 Re: pg_upgrade failing for 200+ million Large Objects Tharakan, Robins <[email protected]>
2021-03-08 12:33 ` Re: pg_upgrade failing for 200+ million Large Objects Magnus Hagander <[email protected]>
2021-03-08 14:13 ` Re: pg_upgrade failing for 200+ million Large Objects Robins Tharakan <[email protected]>
2021-03-08 16:33 ` Re: pg_upgrade failing for 200+ million Large Objects Tom Lane <[email protected]>
2021-03-08 16:35 ` Re: pg_upgrade failing for 200+ million Large Objects Magnus Hagander <[email protected]>
2021-03-08 16:58 ` Re: pg_upgrade failing for 200+ million Large Objects Tom Lane <[email protected]>
2021-03-20 04:39 ` Re: pg_upgrade failing for 200+ million Large Objects Jan Wieck <[email protected]>
@ 2021-03-21 16:50 ` Jan Wieck <[email protected]>
2021-03-21 16:57 ` Re: Fix pg_upgrade to preserve datdba (was: Re: pg_upgrade failing for 200+ million Large Objects) Tom Lane <[email protected]>
2 siblings, 1 reply; 43+ messages in thread
From: Jan Wieck @ 2021-03-21 16:50 UTC (permalink / raw)
To: Tom Lane <[email protected]>; Magnus Hagander <[email protected]>; +Cc: Robins Tharakan <[email protected]>; Peter Eisentraut <[email protected]>; pgsql-hackers
On 3/20/21 12:39 AM, Jan Wieck wrote:
> On the way pg_upgrade also mangles the pg_database.datdba
> (all databases are owned by postgres after an upgrade; will submit a
> separate patch for that as I consider that a bug by itself).
Patch attached.
Regards, Jan
--
Jan Wieck
Principle Database Engineer
Amazon Web Services
Attachments:
[text/x-patch] pg_upgrade-preserve-datdba.v1.diff (3.6K, 2-pg_upgrade-preserve-datdba.v1.diff)
download | inline diff:
diff --git a/src/bin/pg_upgrade/info.c b/src/bin/pg_upgrade/info.c
index 5d9a26c..38f7202 100644
--- a/src/bin/pg_upgrade/info.c
+++ b/src/bin/pg_upgrade/info.c
@@ -344,6 +344,7 @@ get_db_infos(ClusterInfo *cluster)
DbInfo *dbinfos;
int i_datname,
i_oid,
+ i_datdba,
i_encoding,
i_datcollate,
i_datctype,
@@ -351,9 +352,12 @@ get_db_infos(ClusterInfo *cluster)
char query[QUERY_ALLOC];
snprintf(query, sizeof(query),
- "SELECT d.oid, d.datname, d.encoding, d.datcollate, d.datctype, "
+ "SELECT d.oid, d.datname, u.rolname, d.encoding, "
+ "d.datcollate, d.datctype, "
"%s AS spclocation "
"FROM pg_catalog.pg_database d "
+ " JOIN pg_catalog.pg_authid u "
+ " ON d.datdba = u.oid "
" LEFT OUTER JOIN pg_catalog.pg_tablespace t "
" ON d.dattablespace = t.oid "
"WHERE d.datallowconn = true "
@@ -367,6 +371,7 @@ get_db_infos(ClusterInfo *cluster)
i_oid = PQfnumber(res, "oid");
i_datname = PQfnumber(res, "datname");
+ i_datdba = PQfnumber(res, "rolname");
i_encoding = PQfnumber(res, "encoding");
i_datcollate = PQfnumber(res, "datcollate");
i_datctype = PQfnumber(res, "datctype");
@@ -379,6 +384,7 @@ get_db_infos(ClusterInfo *cluster)
{
dbinfos[tupnum].db_oid = atooid(PQgetvalue(res, tupnum, i_oid));
dbinfos[tupnum].db_name = pg_strdup(PQgetvalue(res, tupnum, i_datname));
+ dbinfos[tupnum].db_owner = pg_strdup(PQgetvalue(res, tupnum, i_datdba));
dbinfos[tupnum].db_encoding = atoi(PQgetvalue(res, tupnum, i_encoding));
dbinfos[tupnum].db_collate = pg_strdup(PQgetvalue(res, tupnum, i_datcollate));
dbinfos[tupnum].db_ctype = pg_strdup(PQgetvalue(res, tupnum, i_datctype));
diff --git a/src/bin/pg_upgrade/pg_upgrade.c b/src/bin/pg_upgrade/pg_upgrade.c
index e23b8ca..8fd9a13 100644
--- a/src/bin/pg_upgrade/pg_upgrade.c
+++ b/src/bin/pg_upgrade/pg_upgrade.c
@@ -378,18 +378,36 @@ create_new_objects(void)
* propagate its database-level properties.
*/
if (strcmp(old_db->db_name, "postgres") == 0)
- create_opts = "--clean --create";
+ {
+ parallel_exec_prog(log_file_name,
+ NULL,
+ "\"%s/pg_restore\" %s --exit-on-error "
+ "--verbose --clean --create "
+ "--dbname template1 \"%s\"",
+ new_cluster.bindir,
+ cluster_conn_opts(&new_cluster),
+ sql_file_name);
+ }
else
- create_opts = "--create";
-
- parallel_exec_prog(log_file_name,
- NULL,
- "\"%s/pg_restore\" %s %s --exit-on-error --verbose "
- "--dbname template1 \"%s\"",
- new_cluster.bindir,
- cluster_conn_opts(&new_cluster),
- create_opts,
- sql_file_name);
+ {
+ exec_prog(log_file_name, NULL, true, true,
+ "\"%s/createdb\" -O \"%s\" %s \"%s\"",
+ new_cluster.bindir,
+ old_db->db_owner,
+ cluster_conn_opts(&new_cluster),
+ old_db->db_name);
+ parallel_exec_prog(log_file_name,
+ NULL,
+ "\"%s/pg_restore\" %s --exit-on-error "
+ "--verbose "
+ "--dbname \"%s\" \"%s\"",
+ new_cluster.bindir,
+ cluster_conn_opts(&new_cluster),
+ old_db->db_name,
+ sql_file_name);
+ }
+
+
}
/* reap all children */
diff --git a/src/bin/pg_upgrade/pg_upgrade.h b/src/bin/pg_upgrade/pg_upgrade.h
index 919a784..a3cda97 100644
--- a/src/bin/pg_upgrade/pg_upgrade.h
+++ b/src/bin/pg_upgrade/pg_upgrade.h
@@ -177,6 +177,7 @@ typedef struct
{
Oid db_oid; /* oid of the database */
char *db_name; /* database name */
+ char *db_owner; /* database owner */
char db_tablespace[MAXPGPATH]; /* database default tablespace
* path */
char *db_collate;
^ permalink raw reply [nested|flat] 43+ messages in thread
* Re: Fix pg_upgrade to preserve datdba (was: Re: pg_upgrade failing for 200+ million Large Objects)
2021-03-08 11:02 Re: pg_upgrade failing for 200+ million Large Objects Tharakan, Robins <[email protected]>
2021-03-08 12:33 ` Re: pg_upgrade failing for 200+ million Large Objects Magnus Hagander <[email protected]>
2021-03-08 14:13 ` Re: pg_upgrade failing for 200+ million Large Objects Robins Tharakan <[email protected]>
2021-03-08 16:33 ` Re: pg_upgrade failing for 200+ million Large Objects Tom Lane <[email protected]>
2021-03-08 16:35 ` Re: pg_upgrade failing for 200+ million Large Objects Magnus Hagander <[email protected]>
2021-03-08 16:58 ` Re: pg_upgrade failing for 200+ million Large Objects Tom Lane <[email protected]>
2021-03-20 04:39 ` Re: pg_upgrade failing for 200+ million Large Objects Jan Wieck <[email protected]>
2021-03-21 16:50 ` Fix pg_upgrade to preserve datdba (was: Re: pg_upgrade failing for 200+ million Large Objects) Jan Wieck <[email protected]>
@ 2021-03-21 16:57 ` Tom Lane <[email protected]>
2021-03-21 17:15 ` Re: Fix pg_upgrade to preserve datdba Jan Wieck <[email protected]>
0 siblings, 1 reply; 43+ messages in thread
From: Tom Lane @ 2021-03-21 16:57 UTC (permalink / raw)
To: Jan Wieck <[email protected]>; +Cc: Magnus Hagander <[email protected]>; Robins Tharakan <[email protected]>; Peter Eisentraut <[email protected]>; pgsql-hackers
Jan Wieck <[email protected]> writes:
> On 3/20/21 12:39 AM, Jan Wieck wrote:
>> On the way pg_upgrade also mangles the pg_database.datdba
>> (all databases are owned by postgres after an upgrade; will submit a
>> separate patch for that as I consider that a bug by itself).
> Patch attached.
Hmm, doesn't this lose all *other* database-level properties?
I think maybe what we have here is a bug in pg_restore, its
--create switch ought to be trying to update the database's
ownership.
regards, tom lane
^ permalink raw reply [nested|flat] 43+ messages in thread
* Re: Fix pg_upgrade to preserve datdba
2021-03-08 11:02 Re: pg_upgrade failing for 200+ million Large Objects Tharakan, Robins <[email protected]>
2021-03-08 12:33 ` Re: pg_upgrade failing for 200+ million Large Objects Magnus Hagander <[email protected]>
2021-03-08 14:13 ` Re: pg_upgrade failing for 200+ million Large Objects Robins Tharakan <[email protected]>
2021-03-08 16:33 ` Re: pg_upgrade failing for 200+ million Large Objects Tom Lane <[email protected]>
2021-03-08 16:35 ` Re: pg_upgrade failing for 200+ million Large Objects Magnus Hagander <[email protected]>
2021-03-08 16:58 ` Re: pg_upgrade failing for 200+ million Large Objects Tom Lane <[email protected]>
2021-03-20 04:39 ` Re: pg_upgrade failing for 200+ million Large Objects Jan Wieck <[email protected]>
2021-03-21 16:50 ` Fix pg_upgrade to preserve datdba (was: Re: pg_upgrade failing for 200+ million Large Objects) Jan Wieck <[email protected]>
2021-03-21 16:57 ` Re: Fix pg_upgrade to preserve datdba (was: Re: pg_upgrade failing for 200+ million Large Objects) Tom Lane <[email protected]>
@ 2021-03-21 17:15 ` Jan Wieck <[email protected]>
2021-03-21 17:50 ` Re: Fix pg_upgrade to preserve datdba Jan Wieck <[email protected]>
0 siblings, 1 reply; 43+ messages in thread
From: Jan Wieck @ 2021-03-21 17:15 UTC (permalink / raw)
To: Tom Lane <[email protected]>; +Cc: Magnus Hagander <[email protected]>; Robins Tharakan <[email protected]>; Peter Eisentraut <[email protected]>; pgsql-hackers
On 3/21/21 12:57 PM, Tom Lane wrote:
> Jan Wieck <[email protected]> writes:
>> On 3/20/21 12:39 AM, Jan Wieck wrote:
>>> On the way pg_upgrade also mangles the pg_database.datdba
>>> (all databases are owned by postgres after an upgrade; will submit a
>>> separate patch for that as I consider that a bug by itself).
>
>> Patch attached.
>
> Hmm, doesn't this lose all *other* database-level properties?
>
> I think maybe what we have here is a bug in pg_restore, its
> --create switch ought to be trying to update the database's
> ownership.
Possibly. I didn't look into that route.
Regards, Jan
--
Jan Wieck
Principle Database Engineer
Amazon Web Services
^ permalink raw reply [nested|flat] 43+ messages in thread
* Re: Fix pg_upgrade to preserve datdba
2021-03-08 11:02 Re: pg_upgrade failing for 200+ million Large Objects Tharakan, Robins <[email protected]>
2021-03-08 12:33 ` Re: pg_upgrade failing for 200+ million Large Objects Magnus Hagander <[email protected]>
2021-03-08 14:13 ` Re: pg_upgrade failing for 200+ million Large Objects Robins Tharakan <[email protected]>
2021-03-08 16:33 ` Re: pg_upgrade failing for 200+ million Large Objects Tom Lane <[email protected]>
2021-03-08 16:35 ` Re: pg_upgrade failing for 200+ million Large Objects Magnus Hagander <[email protected]>
2021-03-08 16:58 ` Re: pg_upgrade failing for 200+ million Large Objects Tom Lane <[email protected]>
2021-03-20 04:39 ` Re: pg_upgrade failing for 200+ million Large Objects Jan Wieck <[email protected]>
2021-03-21 16:50 ` Fix pg_upgrade to preserve datdba (was: Re: pg_upgrade failing for 200+ million Large Objects) Jan Wieck <[email protected]>
2021-03-21 16:57 ` Re: Fix pg_upgrade to preserve datdba (was: Re: pg_upgrade failing for 200+ million Large Objects) Tom Lane <[email protected]>
2021-03-21 17:15 ` Re: Fix pg_upgrade to preserve datdba Jan Wieck <[email protected]>
@ 2021-03-21 17:50 ` Jan Wieck <[email protected]>
2021-03-21 18:23 ` Re: Fix pg_upgrade to preserve datdba Tom Lane <[email protected]>
0 siblings, 1 reply; 43+ messages in thread
From: Jan Wieck @ 2021-03-21 17:50 UTC (permalink / raw)
To: Tom Lane <[email protected]>; +Cc: Magnus Hagander <[email protected]>; Robins Tharakan <[email protected]>; Peter Eisentraut <[email protected]>; pgsql-hackers
On 3/21/21 1:15 PM, Jan Wieck wrote:
> On 3/21/21 12:57 PM, Tom Lane wrote:
>> Jan Wieck <[email protected]> writes:
>>> On 3/20/21 12:39 AM, Jan Wieck wrote:
>>>> On the way pg_upgrade also mangles the pg_database.datdba
>>>> (all databases are owned by postgres after an upgrade; will submit a
>>>> separate patch for that as I consider that a bug by itself).
>>
>>> Patch attached.
>>
>> Hmm, doesn't this lose all *other* database-level properties?
>>
>> I think maybe what we have here is a bug in pg_restore, its
>> --create switch ought to be trying to update the database's
>> ownership.
>
> Possibly. I didn't look into that route.
Thanks for that. I like this patch a lot better.
Regards, Jan
--
Jan Wieck
Principle Database Engineer
Amazon Web Services
Attachments:
[text/x-patch] pg_restore-preserve-datdba.v1.diff (530B, 2-pg_restore-preserve-datdba.v1.diff)
download | inline diff:
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index f8bec3f..19c1e71 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -3030,6 +3030,8 @@ dumpDatabase(Archive *fout)
resetPQExpBuffer(creaQry);
resetPQExpBuffer(delQry);
+ appendPQExpBuffer(creaQry, "ALTER DATABASE %s OWNER TO %s;\n", qdatname, dba);
+
if (strlen(datconnlimit) > 0 && strcmp(datconnlimit, "-1") != 0)
appendPQExpBuffer(creaQry, "ALTER DATABASE %s CONNECTION LIMIT = %s;\n",
qdatname, datconnlimit);
^ permalink raw reply [nested|flat] 43+ messages in thread
* Re: Fix pg_upgrade to preserve datdba
2021-03-08 11:02 Re: pg_upgrade failing for 200+ million Large Objects Tharakan, Robins <[email protected]>
2021-03-08 12:33 ` Re: pg_upgrade failing for 200+ million Large Objects Magnus Hagander <[email protected]>
2021-03-08 14:13 ` Re: pg_upgrade failing for 200+ million Large Objects Robins Tharakan <[email protected]>
2021-03-08 16:33 ` Re: pg_upgrade failing for 200+ million Large Objects Tom Lane <[email protected]>
2021-03-08 16:35 ` Re: pg_upgrade failing for 200+ million Large Objects Magnus Hagander <[email protected]>
2021-03-08 16:58 ` Re: pg_upgrade failing for 200+ million Large Objects Tom Lane <[email protected]>
2021-03-20 04:39 ` Re: pg_upgrade failing for 200+ million Large Objects Jan Wieck <[email protected]>
2021-03-21 16:50 ` Fix pg_upgrade to preserve datdba (was: Re: pg_upgrade failing for 200+ million Large Objects) Jan Wieck <[email protected]>
2021-03-21 16:57 ` Re: Fix pg_upgrade to preserve datdba (was: Re: pg_upgrade failing for 200+ million Large Objects) Tom Lane <[email protected]>
2021-03-21 17:15 ` Re: Fix pg_upgrade to preserve datdba Jan Wieck <[email protected]>
2021-03-21 17:50 ` Re: Fix pg_upgrade to preserve datdba Jan Wieck <[email protected]>
@ 2021-03-21 18:23 ` Tom Lane <[email protected]>
2021-03-21 18:34 ` Re: Fix pg_upgrade to preserve datdba Tom Lane <[email protected]>
0 siblings, 1 reply; 43+ messages in thread
From: Tom Lane @ 2021-03-21 18:23 UTC (permalink / raw)
To: Jan Wieck <[email protected]>; +Cc: Magnus Hagander <[email protected]>; Robins Tharakan <[email protected]>; Peter Eisentraut <[email protected]>; pgsql-hackers
Jan Wieck <[email protected]> writes:
>> On 3/21/21 12:57 PM, Tom Lane wrote:
>>> I think maybe what we have here is a bug in pg_restore, its
>>> --create switch ought to be trying to update the database's
>>> ownership.
> Thanks for that. I like this patch a lot better.
Needs a little more work than that --- we should allow it to respond
to the --no-owner switch, for example. But I think likely we can do
it where other object ownership is handled. I'll look in a bit.
regards, tom lane
^ permalink raw reply [nested|flat] 43+ messages in thread
* Re: Fix pg_upgrade to preserve datdba
2021-03-08 11:02 Re: pg_upgrade failing for 200+ million Large Objects Tharakan, Robins <[email protected]>
2021-03-08 12:33 ` Re: pg_upgrade failing for 200+ million Large Objects Magnus Hagander <[email protected]>
2021-03-08 14:13 ` Re: pg_upgrade failing for 200+ million Large Objects Robins Tharakan <[email protected]>
2021-03-08 16:33 ` Re: pg_upgrade failing for 200+ million Large Objects Tom Lane <[email protected]>
2021-03-08 16:35 ` Re: pg_upgrade failing for 200+ million Large Objects Magnus Hagander <[email protected]>
2021-03-08 16:58 ` Re: pg_upgrade failing for 200+ million Large Objects Tom Lane <[email protected]>
2021-03-20 04:39 ` Re: pg_upgrade failing for 200+ million Large Objects Jan Wieck <[email protected]>
2021-03-21 16:50 ` Fix pg_upgrade to preserve datdba (was: Re: pg_upgrade failing for 200+ million Large Objects) Jan Wieck <[email protected]>
2021-03-21 16:57 ` Re: Fix pg_upgrade to preserve datdba (was: Re: pg_upgrade failing for 200+ million Large Objects) Tom Lane <[email protected]>
2021-03-21 17:15 ` Re: Fix pg_upgrade to preserve datdba Jan Wieck <[email protected]>
2021-03-21 17:50 ` Re: Fix pg_upgrade to preserve datdba Jan Wieck <[email protected]>
2021-03-21 18:23 ` Re: Fix pg_upgrade to preserve datdba Tom Lane <[email protected]>
@ 2021-03-21 18:34 ` Tom Lane <[email protected]>
2021-03-21 19:29 ` Re: Fix pg_upgrade to preserve datdba Tom Lane <[email protected]>
2021-03-21 19:36 ` Re: Fix pg_upgrade to preserve datdba Jan Wieck <[email protected]>
0 siblings, 2 replies; 43+ messages in thread
From: Tom Lane @ 2021-03-21 18:34 UTC (permalink / raw)
To: Jan Wieck <[email protected]>; +Cc: Magnus Hagander <[email protected]>; Robins Tharakan <[email protected]>; Peter Eisentraut <[email protected]>; pgsql-hackers
I wrote:
> Needs a little more work than that --- we should allow it to respond
> to the --no-owner switch, for example. But I think likely we can do
> it where other object ownership is handled. I'll look in a bit.
Actually ... said code already DOES do that, so now I'm confused.
I tried
regression=# create user joe;
CREATE ROLE
regression=# create database joe owner joe;
CREATE DATABASE
regression=# \q
$ pg_dump -Fc joe >joe.dump
$ pg_restore --create -f - joe.dump | more
and I see
--
-- Name: joe; Type: DATABASE; Schema: -; Owner: joe
--
CREATE DATABASE joe WITH TEMPLATE = template0 ENCODING = 'SQL_ASCII' LOCALE = 'C';
ALTER DATABASE joe OWNER TO joe;
so at least in this case it's doing the right thing. We need a bit
more detail about the context in which it's doing the wrong thing
for you.
regards, tom lane
^ permalink raw reply [nested|flat] 43+ messages in thread
* Re: Fix pg_upgrade to preserve datdba
2021-03-08 11:02 Re: pg_upgrade failing for 200+ million Large Objects Tharakan, Robins <[email protected]>
2021-03-08 12:33 ` Re: pg_upgrade failing for 200+ million Large Objects Magnus Hagander <[email protected]>
2021-03-08 14:13 ` Re: pg_upgrade failing for 200+ million Large Objects Robins Tharakan <[email protected]>
2021-03-08 16:33 ` Re: pg_upgrade failing for 200+ million Large Objects Tom Lane <[email protected]>
2021-03-08 16:35 ` Re: pg_upgrade failing for 200+ million Large Objects Magnus Hagander <[email protected]>
2021-03-08 16:58 ` Re: pg_upgrade failing for 200+ million Large Objects Tom Lane <[email protected]>
2021-03-20 04:39 ` Re: pg_upgrade failing for 200+ million Large Objects Jan Wieck <[email protected]>
2021-03-21 16:50 ` Fix pg_upgrade to preserve datdba (was: Re: pg_upgrade failing for 200+ million Large Objects) Jan Wieck <[email protected]>
2021-03-21 16:57 ` Re: Fix pg_upgrade to preserve datdba (was: Re: pg_upgrade failing for 200+ million Large Objects) Tom Lane <[email protected]>
2021-03-21 17:15 ` Re: Fix pg_upgrade to preserve datdba Jan Wieck <[email protected]>
2021-03-21 17:50 ` Re: Fix pg_upgrade to preserve datdba Jan Wieck <[email protected]>
2021-03-21 18:23 ` Re: Fix pg_upgrade to preserve datdba Tom Lane <[email protected]>
2021-03-21 18:34 ` Re: Fix pg_upgrade to preserve datdba Tom Lane <[email protected]>
@ 2021-03-21 19:29 ` Tom Lane <[email protected]>
1 sibling, 0 replies; 43+ messages in thread
From: Tom Lane @ 2021-03-21 19:29 UTC (permalink / raw)
To: Jan Wieck <[email protected]>; +Cc: Magnus Hagander <[email protected]>; Robins Tharakan <[email protected]>; Peter Eisentraut <[email protected]>; pgsql-hackers
I wrote:
> ... so at least in this case it's doing the right thing. We need a bit
> more detail about the context in which it's doing the wrong thing
> for you.
Just to cross-check, I tried modifying pg_upgrade's regression test
as attached, and it still passes. (And inspection of the leftover
dump2.sql file verifies that the database ownership was correct.)
So I'm not sure what's up here.
regards, tom lane
Attachments:
[text/x-diff] upgrade-test-dbownership.patch (1.6K, 2-upgrade-test-dbownership.patch)
download | inline diff:
diff --git a/src/bin/pg_upgrade/test.sh b/src/bin/pg_upgrade/test.sh
index 9c6deae294..436646b5ba 100644
--- a/src/bin/pg_upgrade/test.sh
+++ b/src/bin/pg_upgrade/test.sh
@@ -150,6 +150,9 @@ export EXTRA_REGRESS_OPTS
standard_initdb "$oldbindir"/initdb
"$oldbindir"/pg_ctl start -l "$logdir/postmaster1.log" -o "$POSTMASTER_OPTS" -w
+# Create another user (just to exercise database ownership restoration).
+createuser regression_dbowner || createdb_status=$?
+
# Create databases with names covering the ASCII bytes other than NUL, BEL,
# LF, or CR. BEL would ring the terminal bell in the course of this test, and
# it is not otherwise a special case. PostgreSQL doesn't support the rest.
@@ -160,7 +163,7 @@ dbname1='\"\'$dbname1'\\"\\\'
dbname2=`awk 'BEGIN { for (i = 46; i < 91; i++) printf "%c", i }' </dev/null`
dbname3=`awk 'BEGIN { for (i = 91; i < 128; i++) printf "%c", i }' </dev/null`
createdb "regression$dbname1" || createdb_status=$?
-createdb "regression$dbname2" || createdb_status=$?
+createdb --owner=regression_dbowner "regression$dbname2" || createdb_status=$?
createdb "regression$dbname3" || createdb_status=$?
if "$MAKE" -C "$oldsrc" installcheck-parallel; then
@@ -227,7 +230,7 @@ PGDATA="$BASE_PGDATA"
standard_initdb 'initdb'
-pg_upgrade $PG_UPGRADE_OPTS -d "${PGDATA}.old" -D "$PGDATA" -b "$oldbindir" -p "$PGPORT" -P "$PGPORT"
+pg_upgrade $PG_UPGRADE_OPTS -d "${PGDATA}.old" -D "$PGDATA" -b "$oldbindir" -p "$PGPORT" -P "$PGPORT" -j 4
# make sure all directories and files have group permissions, on Unix hosts
# Windows hosts don't support Unix-y permissions.
^ permalink raw reply [nested|flat] 43+ messages in thread
* Re: Fix pg_upgrade to preserve datdba
2021-03-08 11:02 Re: pg_upgrade failing for 200+ million Large Objects Tharakan, Robins <[email protected]>
2021-03-08 12:33 ` Re: pg_upgrade failing for 200+ million Large Objects Magnus Hagander <[email protected]>
2021-03-08 14:13 ` Re: pg_upgrade failing for 200+ million Large Objects Robins Tharakan <[email protected]>
2021-03-08 16:33 ` Re: pg_upgrade failing for 200+ million Large Objects Tom Lane <[email protected]>
2021-03-08 16:35 ` Re: pg_upgrade failing for 200+ million Large Objects Magnus Hagander <[email protected]>
2021-03-08 16:58 ` Re: pg_upgrade failing for 200+ million Large Objects Tom Lane <[email protected]>
2021-03-20 04:39 ` Re: pg_upgrade failing for 200+ million Large Objects Jan Wieck <[email protected]>
2021-03-21 16:50 ` Fix pg_upgrade to preserve datdba (was: Re: pg_upgrade failing for 200+ million Large Objects) Jan Wieck <[email protected]>
2021-03-21 16:57 ` Re: Fix pg_upgrade to preserve datdba (was: Re: pg_upgrade failing for 200+ million Large Objects) Tom Lane <[email protected]>
2021-03-21 17:15 ` Re: Fix pg_upgrade to preserve datdba Jan Wieck <[email protected]>
2021-03-21 17:50 ` Re: Fix pg_upgrade to preserve datdba Jan Wieck <[email protected]>
2021-03-21 18:23 ` Re: Fix pg_upgrade to preserve datdba Tom Lane <[email protected]>
2021-03-21 18:34 ` Re: Fix pg_upgrade to preserve datdba Tom Lane <[email protected]>
@ 2021-03-21 19:36 ` Jan Wieck <[email protected]>
2021-03-21 19:56 ` Re: Fix pg_upgrade to preserve datdba Tom Lane <[email protected]>
1 sibling, 1 reply; 43+ messages in thread
From: Jan Wieck @ 2021-03-21 19:36 UTC (permalink / raw)
To: Tom Lane <[email protected]>; +Cc: Magnus Hagander <[email protected]>; Robins Tharakan <[email protected]>; Peter Eisentraut <[email protected]>; pgsql-hackers
On 3/21/21 2:34 PM, Tom Lane wrote:
> and I see
>
> --
> -- Name: joe; Type: DATABASE; Schema: -; Owner: joe
> --
>
> CREATE DATABASE joe WITH TEMPLATE = template0 ENCODING = 'SQL_ASCII' LOCALE = 'C';
>
>
> ALTER DATABASE joe OWNER TO joe;
>
> so at least in this case it's doing the right thing. We need a bit
> more detail about the context in which it's doing the wrong thing
> for you.
After moving all of this to a pristine postgresql.org based repo I see
the same. My best guess at this point is that the permission hoops, that
RDS and Aurora PostgreSQL are jumping through, was messing with this.
But that has nothing to do with the actual topic.
So let's focus on the actual problem of running out of XIDs and memory
while doing the upgrade involving millions of small large objects.
Regards, Jan
--
Jan Wieck
Principle Database Engineer
Amazon Web Services
^ permalink raw reply [nested|flat] 43+ messages in thread
* Re: Fix pg_upgrade to preserve datdba
2021-03-08 11:02 Re: pg_upgrade failing for 200+ million Large Objects Tharakan, Robins <[email protected]>
2021-03-08 12:33 ` Re: pg_upgrade failing for 200+ million Large Objects Magnus Hagander <[email protected]>
2021-03-08 14:13 ` Re: pg_upgrade failing for 200+ million Large Objects Robins Tharakan <[email protected]>
2021-03-08 16:33 ` Re: pg_upgrade failing for 200+ million Large Objects Tom Lane <[email protected]>
2021-03-08 16:35 ` Re: pg_upgrade failing for 200+ million Large Objects Magnus Hagander <[email protected]>
2021-03-08 16:58 ` Re: pg_upgrade failing for 200+ million Large Objects Tom Lane <[email protected]>
2021-03-20 04:39 ` Re: pg_upgrade failing for 200+ million Large Objects Jan Wieck <[email protected]>
2021-03-21 16:50 ` Fix pg_upgrade to preserve datdba (was: Re: pg_upgrade failing for 200+ million Large Objects) Jan Wieck <[email protected]>
2021-03-21 16:57 ` Re: Fix pg_upgrade to preserve datdba (was: Re: pg_upgrade failing for 200+ million Large Objects) Tom Lane <[email protected]>
2021-03-21 17:15 ` Re: Fix pg_upgrade to preserve datdba Jan Wieck <[email protected]>
2021-03-21 17:50 ` Re: Fix pg_upgrade to preserve datdba Jan Wieck <[email protected]>
2021-03-21 18:23 ` Re: Fix pg_upgrade to preserve datdba Tom Lane <[email protected]>
2021-03-21 18:34 ` Re: Fix pg_upgrade to preserve datdba Tom Lane <[email protected]>
2021-03-21 19:36 ` Re: Fix pg_upgrade to preserve datdba Jan Wieck <[email protected]>
@ 2021-03-21 19:56 ` Tom Lane <[email protected]>
2021-03-22 18:07 ` Re: Fix pg_upgrade to preserve datdba Jan Wieck <[email protected]>
0 siblings, 1 reply; 43+ messages in thread
From: Tom Lane @ 2021-03-21 19:56 UTC (permalink / raw)
To: Jan Wieck <[email protected]>; +Cc: Magnus Hagander <[email protected]>; Robins Tharakan <[email protected]>; Peter Eisentraut <[email protected]>; pgsql-hackers
Jan Wieck <[email protected]> writes:
> So let's focus on the actual problem of running out of XIDs and memory
> while doing the upgrade involving millions of small large objects.
Right. So as far as --single-transaction vs. --create goes, that's
mostly a definitional problem. As long as the contents of a DB are
restored in one transaction, it's not gonna matter if we eat one or
two more XIDs while creating the DB itself. So we could either
relax pg_restore's complaint, or invent a different switch that's
named to acknowledge that it's not really only one transaction.
That still leaves us with the lots-o-locks problem. However, once
we've crossed the Rubicon of "it's not really only one transaction",
you could imagine that the switch is "--fewer-transactions", and the
idea is for pg_restore to commit after every (say) 100000 operations.
That would both bound its lock requirements and greatly cut its XID
consumption.
The work you described sounded like it could fit into that paradigm,
with the additional ability to run some parallel restore tasks
that are each consuming a bounded number of locks.
regards, tom lane
^ permalink raw reply [nested|flat] 43+ messages in thread
* Re: Fix pg_upgrade to preserve datdba
2021-03-08 11:02 Re: pg_upgrade failing for 200+ million Large Objects Tharakan, Robins <[email protected]>
2021-03-08 12:33 ` Re: pg_upgrade failing for 200+ million Large Objects Magnus Hagander <[email protected]>
2021-03-08 14:13 ` Re: pg_upgrade failing for 200+ million Large Objects Robins Tharakan <[email protected]>
2021-03-08 16:33 ` Re: pg_upgrade failing for 200+ million Large Objects Tom Lane <[email protected]>
2021-03-08 16:35 ` Re: pg_upgrade failing for 200+ million Large Objects Magnus Hagander <[email protected]>
2021-03-08 16:58 ` Re: pg_upgrade failing for 200+ million Large Objects Tom Lane <[email protected]>
2021-03-20 04:39 ` Re: pg_upgrade failing for 200+ million Large Objects Jan Wieck <[email protected]>
2021-03-21 16:50 ` Fix pg_upgrade to preserve datdba (was: Re: pg_upgrade failing for 200+ million Large Objects) Jan Wieck <[email protected]>
2021-03-21 16:57 ` Re: Fix pg_upgrade to preserve datdba (was: Re: pg_upgrade failing for 200+ million Large Objects) Tom Lane <[email protected]>
2021-03-21 17:15 ` Re: Fix pg_upgrade to preserve datdba Jan Wieck <[email protected]>
2021-03-21 17:50 ` Re: Fix pg_upgrade to preserve datdba Jan Wieck <[email protected]>
2021-03-21 18:23 ` Re: Fix pg_upgrade to preserve datdba Tom Lane <[email protected]>
2021-03-21 18:34 ` Re: Fix pg_upgrade to preserve datdba Tom Lane <[email protected]>
2021-03-21 19:36 ` Re: Fix pg_upgrade to preserve datdba Jan Wieck <[email protected]>
2021-03-21 19:56 ` Re: Fix pg_upgrade to preserve datdba Tom Lane <[email protected]>
@ 2021-03-22 18:07 ` Jan Wieck <[email protected]>
0 siblings, 0 replies; 43+ messages in thread
From: Jan Wieck @ 2021-03-22 18:07 UTC (permalink / raw)
To: Tom Lane <[email protected]>; +Cc: Magnus Hagander <[email protected]>; Robins Tharakan <[email protected]>; Peter Eisentraut <[email protected]>; pgsql-hackers
On 3/21/21 3:56 PM, Tom Lane wrote:
> Jan Wieck <[email protected]> writes:
>> So let's focus on the actual problem of running out of XIDs and memory
>> while doing the upgrade involving millions of small large objects.
>
> Right. So as far as --single-transaction vs. --create goes, that's
> mostly a definitional problem. As long as the contents of a DB are
> restored in one transaction, it's not gonna matter if we eat one or
> two more XIDs while creating the DB itself. So we could either
> relax pg_restore's complaint, or invent a different switch that's
> named to acknowledge that it's not really only one transaction.
>
> That still leaves us with the lots-o-locks problem. However, once
> we've crossed the Rubicon of "it's not really only one transaction",
> you could imagine that the switch is "--fewer-transactions", and the
> idea is for pg_restore to commit after every (say) 100000 operations.
> That would both bound its lock requirements and greatly cut its XID
> consumption.
It leaves us with three things.
1) tremendous amounts of locks
2) tremendous amounts of memory needed
3) taking forever because it is single threaded.
I created a pathological case here on a VM with 24GB of RAM, 80GB of
SWAP sitting on NVME. The database has 20 million large objects, each of
which has 2 GRANTS, 1 COMMENT and 1 SECURITY LABEL (dummy). Each LO only
contains a string "large object <oid>", so the whole database in 9.5 is
about 15GB in size.
A stock pg_upgrade to version 14devel using --link takes about 15 hours.
This is partly because the pg_dump and pg_restore both grow to something
like 50GB+ to hold the TOC. Which sounds out of touch considering that
the entire system catalog on disk is less than 15GB. But aside from the
ridiculous amount of swapping, the whole thing also suffers from
consuming about 80 million transactions and apparently having just as
many network round trips with a single client.
>
> The work you described sounded like it could fit into that paradigm,
> with the additional ability to run some parallel restore tasks
> that are each consuming a bounded number of locks.
I have attached a POC patch that implements two new options for pg_upgrade.
--restore-jobs=NUM --jobs parameter passed to pg_restore
--restore-blob-batch-size=NUM number of blobs restored in one xact
It does a bit more than just that. It rearranges the way large objects
are dumped so that most of the commands are all in one TOC entry and the
entry is emitted into SECTION_DATA when in binary upgrade mode (which
guarantees that there isn't any actual BLOB data in the dump). This
greatly reduces the number of network round trips and when using 8
parallel restore jobs, almost saturates the 4-core VM. Reducing the
number of TOC entries also reduces the total virtual memory need of
pg_restore to 15G, so there is a lot less swapping going on.
It cuts down the pg_upgrade time from 15 hours to 1.5 hours. In that run
I used --restore-jobs=8 and --restore-blob-batch-size=10000 (with a
max_locks_per_transaction=12000).
As said, this isn't a "one size fits all" solution. The pg_upgrade
parameters for --jobs and --restore-jobs will really depend on the
situation. Hundreds of small databases want --jobs, but one database
with millions of large objects wants --restore-jobs.
Regards, Jan
--
Jan Wieck
Principle Database Engineer
Amazon Web Services
Attachments:
[text/x-patch] pg_upgrade_improvements.v2.diff (22.9K, 2-pg_upgrade_improvements.v2.diff)
download | inline diff:
diff --git a/src/bin/pg_dump/parallel.c b/src/bin/pg_dump/parallel.c
index c7351a4..4a611d0 100644
--- a/src/bin/pg_dump/parallel.c
+++ b/src/bin/pg_dump/parallel.c
@@ -864,6 +864,11 @@ RunWorker(ArchiveHandle *AH, ParallelSlot *slot)
WaitForCommands(AH, pipefd);
/*
+ * Close an eventually open BLOB batch transaction.
+ */
+ CommitBlobTransaction((Archive *)AH);
+
+ /*
* Disconnect from database and clean up.
*/
set_cancel_slot_archive(slot, NULL);
diff --git a/src/bin/pg_dump/pg_backup.h b/src/bin/pg_dump/pg_backup.h
index 0296b9b..cd8a590 100644
--- a/src/bin/pg_dump/pg_backup.h
+++ b/src/bin/pg_dump/pg_backup.h
@@ -203,6 +203,8 @@ typedef struct Archive
int numWorkers; /* number of parallel processes */
char *sync_snapshot_id; /* sync snapshot id for parallel operation */
+ int blobBatchSize; /* # of blobs to restore per transaction */
+
/* info needed for string escaping */
int encoding; /* libpq code for client_encoding */
bool std_strings; /* standard_conforming_strings */
@@ -269,6 +271,7 @@ extern void WriteData(Archive *AH, const void *data, size_t dLen);
extern int StartBlob(Archive *AH, Oid oid);
extern int EndBlob(Archive *AH, Oid oid);
+extern void CommitBlobTransaction(Archive *AH);
extern void CloseArchive(Archive *AH);
extern void SetArchiveOptions(Archive *AH, DumpOptions *dopt, RestoreOptions *ropt);
diff --git a/src/bin/pg_dump/pg_backup_archiver.c b/src/bin/pg_dump/pg_backup_archiver.c
index 1f82c64..51a862a 100644
--- a/src/bin/pg_dump/pg_backup_archiver.c
+++ b/src/bin/pg_dump/pg_backup_archiver.c
@@ -68,6 +68,8 @@ typedef struct _parallelReadyList
bool sorted; /* are valid entries currently sorted? */
} ParallelReadyList;
+static int blobBatchCount = 0;
+static bool blobInXact = false;
static ArchiveHandle *_allocAH(const char *FileSpec, const ArchiveFormat fmt,
const int compression, bool dosync, ArchiveMode mode,
@@ -265,6 +267,8 @@ CloseArchive(Archive *AHX)
int res = 0;
ArchiveHandle *AH = (ArchiveHandle *) AHX;
+ CommitBlobTransaction(AHX);
+
AH->ClosePtr(AH);
/* Close the output */
@@ -279,6 +283,24 @@ CloseArchive(Archive *AHX)
/* Public */
void
+CommitBlobTransaction(Archive *AHX)
+{
+ ArchiveHandle *AH = (ArchiveHandle *) AHX;
+
+ if (blobInXact)
+ {
+ ahprintf(AH, "--\n");
+ ahprintf(AH, "-- End BLOB restore batch\n");
+ ahprintf(AH, "--\n");
+ ahprintf(AH, "COMMIT;\n\n");
+
+ blobBatchCount = 0;
+ blobInXact = false;
+ }
+}
+
+/* Public */
+void
SetArchiveOptions(Archive *AH, DumpOptions *dopt, RestoreOptions *ropt)
{
/* Caller can omit dump options, in which case we synthesize them */
@@ -3531,6 +3553,59 @@ _printTocEntry(ArchiveHandle *AH, TocEntry *te, bool isData)
{
RestoreOptions *ropt = AH->public.ropt;
+ /* We restore BLOBs in batches to reduce XID consumption */
+ if (strcmp(te->desc, "BLOB") == 0 && AH->public.blobBatchSize > 0)
+ {
+ if (blobInXact)
+ {
+ /* We are inside a BLOB restore transaction */
+ if (blobBatchCount >= AH->public.blobBatchSize)
+ {
+ /*
+ * We did reach the batch size with the previous BLOB.
+ * Commit and start a new batch.
+ */
+ ahprintf(AH, "--\n");
+ ahprintf(AH, "-- BLOB batch size reached\n");
+ ahprintf(AH, "--\n");
+ ahprintf(AH, "COMMIT;\n");
+ ahprintf(AH, "BEGIN;\n\n");
+
+ blobBatchCount = 1;
+ }
+ else
+ {
+ /* This one still fits into the current batch */
+ blobBatchCount++;
+ }
+ }
+ else
+ {
+ /* Not inside a transaction, start a new batch */
+ ahprintf(AH, "--\n");
+ ahprintf(AH, "-- Start BLOB restore batch\n");
+ ahprintf(AH, "--\n");
+ ahprintf(AH, "BEGIN;\n\n");
+
+ blobBatchCount = 1;
+ blobInXact = true;
+ }
+ }
+ else
+ {
+ /* Not a BLOB. If we have a BLOB batch open, close it. */
+ if (blobInXact)
+ {
+ ahprintf(AH, "--\n");
+ ahprintf(AH, "-- End BLOB restore batch\n");
+ ahprintf(AH, "--\n");
+ ahprintf(AH, "COMMIT;\n\n");
+
+ blobBatchCount = 0;
+ blobInXact = false;
+ }
+ }
+
/* Select owner, schema, tablespace and default AM as necessary */
_becomeOwner(AH, te);
_selectOutputSchema(AH, te->namespace);
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index f8bec3f..f153f08 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -165,12 +165,20 @@ static void guessConstraintInheritance(TableInfo *tblinfo, int numTables);
static void dumpComment(Archive *fout, const char *type, const char *name,
const char *namespace, const char *owner,
CatalogId catalogId, int subid, DumpId dumpId);
+static bool dumpCommentQuery(Archive *fout, PQExpBuffer query, PQExpBuffer tag,
+ const char *type, const char *name,
+ const char *namespace, const char *owner,
+ CatalogId catalogId, int subid, DumpId dumpId);
static int findComments(Archive *fout, Oid classoid, Oid objoid,
CommentItem **items);
static int collectComments(Archive *fout, CommentItem **items);
static void dumpSecLabel(Archive *fout, const char *type, const char *name,
const char *namespace, const char *owner,
CatalogId catalogId, int subid, DumpId dumpId);
+static bool dumpSecLabelQuery(Archive *fout, PQExpBuffer query, PQExpBuffer tag,
+ const char *type, const char *name,
+ const char *namespace, const char *owner,
+ CatalogId catalogId, int subid, DumpId dumpId);
static int findSecLabels(Archive *fout, Oid classoid, Oid objoid,
SecLabelItem **items);
static int collectSecLabels(Archive *fout, SecLabelItem **items);
@@ -227,6 +235,13 @@ static DumpId dumpACL(Archive *fout, DumpId objDumpId, DumpId altDumpId,
const char *nspname, const char *owner,
const char *acls, const char *racls,
const char *initacls, const char *initracls);
+static bool dumpACLQuery(Archive *fout, PQExpBuffer query, PQExpBuffer tag,
+ DumpId objDumpId, DumpId altDumpId,
+ const char *type, const char *name,
+ const char *subname,
+ const char *nspname, const char *owner,
+ const char *acls, const char *racls,
+ const char *initacls, const char *initracls);
static void getDependencies(Archive *fout);
static void BuildArchiveDependencies(Archive *fout);
@@ -3468,11 +3483,44 @@ dumpBlob(Archive *fout, const BlobInfo *binfo)
{
PQExpBuffer cquery = createPQExpBuffer();
PQExpBuffer dquery = createPQExpBuffer();
+ PQExpBuffer tag = createPQExpBuffer();
+ teSection section = SECTION_PRE_DATA;
appendPQExpBuffer(cquery,
"SELECT pg_catalog.lo_create('%s');\n",
binfo->dobj.name);
+ /*
+ * In binary upgrade mode we put all the queries to restore
+ * one large object into a single TOC entry and emit it as
+ * SECTION_DATA so that they can be restored in parallel.
+ */
+ if (fout->dopt->binary_upgrade)
+ {
+ section = SECTION_DATA;
+
+ /* Dump comment if any */
+ if (binfo->dobj.dump & DUMP_COMPONENT_COMMENT)
+ dumpCommentQuery(fout, cquery, tag, "LARGE OBJECT",
+ binfo->dobj.name, NULL, binfo->rolname,
+ binfo->dobj.catId, 0, binfo->dobj.dumpId);
+
+ /* Dump security label if any */
+ if (binfo->dobj.dump & DUMP_COMPONENT_SECLABEL)
+ dumpSecLabelQuery(fout, cquery, tag, "LARGE OBJECT",
+ binfo->dobj.name,
+ NULL, binfo->rolname,
+ binfo->dobj.catId, 0, binfo->dobj.dumpId);
+
+ /* Dump ACL if any */
+ if (binfo->blobacl && (binfo->dobj.dump & DUMP_COMPONENT_ACL))
+ dumpACLQuery(fout, cquery, tag,
+ binfo->dobj.dumpId, InvalidDumpId, "LARGE OBJECT",
+ binfo->dobj.name, NULL,
+ NULL, binfo->rolname, binfo->blobacl, binfo->rblobacl,
+ binfo->initblobacl, binfo->initrblobacl);
+ }
+
appendPQExpBuffer(dquery,
"SELECT pg_catalog.lo_unlink('%s');\n",
binfo->dobj.name);
@@ -3482,28 +3530,31 @@ dumpBlob(Archive *fout, const BlobInfo *binfo)
ARCHIVE_OPTS(.tag = binfo->dobj.name,
.owner = binfo->rolname,
.description = "BLOB",
- .section = SECTION_PRE_DATA,
+ .section = section,
.createStmt = cquery->data,
.dropStmt = dquery->data));
- /* Dump comment if any */
- if (binfo->dobj.dump & DUMP_COMPONENT_COMMENT)
- dumpComment(fout, "LARGE OBJECT", binfo->dobj.name,
- NULL, binfo->rolname,
- binfo->dobj.catId, 0, binfo->dobj.dumpId);
-
- /* Dump security label if any */
- if (binfo->dobj.dump & DUMP_COMPONENT_SECLABEL)
- dumpSecLabel(fout, "LARGE OBJECT", binfo->dobj.name,
- NULL, binfo->rolname,
- binfo->dobj.catId, 0, binfo->dobj.dumpId);
-
- /* Dump ACL if any */
- if (binfo->blobacl && (binfo->dobj.dump & DUMP_COMPONENT_ACL))
- dumpACL(fout, binfo->dobj.dumpId, InvalidDumpId, "LARGE OBJECT",
- binfo->dobj.name, NULL,
- NULL, binfo->rolname, binfo->blobacl, binfo->rblobacl,
- binfo->initblobacl, binfo->initrblobacl);
+ if (!fout->dopt->binary_upgrade)
+ {
+ /* Dump comment if any */
+ if (binfo->dobj.dump & DUMP_COMPONENT_COMMENT)
+ dumpComment(fout, "LARGE OBJECT", binfo->dobj.name,
+ NULL, binfo->rolname,
+ binfo->dobj.catId, 0, binfo->dobj.dumpId);
+
+ /* Dump security label if any */
+ if (binfo->dobj.dump & DUMP_COMPONENT_SECLABEL)
+ dumpSecLabel(fout, "LARGE OBJECT", binfo->dobj.name,
+ NULL, binfo->rolname,
+ binfo->dobj.catId, 0, binfo->dobj.dumpId);
+
+ /* Dump ACL if any */
+ if (binfo->blobacl && (binfo->dobj.dump & DUMP_COMPONENT_ACL))
+ dumpACL(fout, binfo->dobj.dumpId, InvalidDumpId, "LARGE OBJECT",
+ binfo->dobj.name, NULL,
+ NULL, binfo->rolname, binfo->blobacl, binfo->rblobacl,
+ binfo->initblobacl, binfo->initrblobacl);
+ }
destroyPQExpBuffer(cquery);
destroyPQExpBuffer(dquery);
@@ -9868,25 +9919,56 @@ dumpComment(Archive *fout, const char *type, const char *name,
const char *namespace, const char *owner,
CatalogId catalogId, int subid, DumpId dumpId)
{
+ PQExpBuffer query = createPQExpBuffer();
+ PQExpBuffer tag = createPQExpBuffer();
+
+ if (dumpCommentQuery(fout, query, tag, type, name, namespace, owner,
+ catalogId, subid, dumpId))
+ {
+ /*
+ * We mark comments as SECTION_NONE because they really belong in the
+ * same section as their parent, whether that is pre-data or
+ * post-data.
+ */
+ ArchiveEntry(fout, nilCatalogId, createDumpId(),
+ ARCHIVE_OPTS(.tag = tag->data,
+ .namespace = namespace,
+ .owner = owner,
+ .description = "COMMENT",
+ .section = SECTION_NONE,
+ .createStmt = query->data,
+ .deps = &dumpId,
+ .nDeps = 1));
+ }
+ destroyPQExpBuffer(query);
+ destroyPQExpBuffer(tag);
+}
+
+static bool
+dumpCommentQuery(Archive *fout, PQExpBuffer query, PQExpBuffer tag,
+ const char *type, const char *name,
+ const char *namespace, const char *owner,
+ CatalogId catalogId, int subid, DumpId dumpId)
+{
DumpOptions *dopt = fout->dopt;
CommentItem *comments;
int ncomments;
/* do nothing, if --no-comments is supplied */
if (dopt->no_comments)
- return;
+ return false;
/* Comments are schema not data ... except blob comments are data */
if (strcmp(type, "LARGE OBJECT") != 0)
{
if (dopt->dataOnly)
- return;
+ return false;
}
else
{
/* We do dump blob comments in binary-upgrade mode */
if (dopt->schemaOnly && !dopt->binary_upgrade)
- return;
+ return false;
}
/* Search for comments associated with catalogId, using table */
@@ -9905,9 +9987,6 @@ dumpComment(Archive *fout, const char *type, const char *name,
/* If a comment exists, build COMMENT ON statement */
if (ncomments > 0)
{
- PQExpBuffer query = createPQExpBuffer();
- PQExpBuffer tag = createPQExpBuffer();
-
appendPQExpBuffer(query, "COMMENT ON %s ", type);
if (namespace && *namespace)
appendPQExpBuffer(query, "%s.", fmtId(namespace));
@@ -9917,24 +9996,10 @@ dumpComment(Archive *fout, const char *type, const char *name,
appendPQExpBuffer(tag, "%s %s", type, name);
- /*
- * We mark comments as SECTION_NONE because they really belong in the
- * same section as their parent, whether that is pre-data or
- * post-data.
- */
- ArchiveEntry(fout, nilCatalogId, createDumpId(),
- ARCHIVE_OPTS(.tag = tag->data,
- .namespace = namespace,
- .owner = owner,
- .description = "COMMENT",
- .section = SECTION_NONE,
- .createStmt = query->data,
- .deps = &dumpId,
- .nDeps = 1));
-
- destroyPQExpBuffer(query);
- destroyPQExpBuffer(tag);
+ return true;
}
+
+ return false;
}
/*
@@ -15070,18 +15135,63 @@ dumpACL(Archive *fout, DumpId objDumpId, DumpId altDumpId,
const char *initacls, const char *initracls)
{
DumpId aclDumpId = InvalidDumpId;
+ PQExpBuffer query = createPQExpBuffer();
+ PQExpBuffer tag = createPQExpBuffer();
+
+ if (dumpACLQuery(fout, query, tag, objDumpId, altDumpId,
+ type, name, subname, nspname, owner,
+ acls, racls, initacls, initracls))
+ {
+ DumpId aclDeps[2];
+ int nDeps = 0;
+
+ if (subname)
+ appendPQExpBuffer(tag, "COLUMN %s.%s", name, subname);
+ else
+ appendPQExpBuffer(tag, "%s %s", type, name);
+
+ aclDeps[nDeps++] = objDumpId;
+ if (altDumpId != InvalidDumpId)
+ aclDeps[nDeps++] = altDumpId;
+
+ aclDumpId = createDumpId();
+
+ ArchiveEntry(fout, nilCatalogId, aclDumpId,
+ ARCHIVE_OPTS(.tag = tag->data,
+ .namespace = nspname,
+ .owner = owner,
+ .description = "ACL",
+ .section = SECTION_NONE,
+ .createStmt = query->data,
+ .deps = aclDeps,
+ .nDeps = nDeps));
+
+ }
+
+ destroyPQExpBuffer(query);
+ destroyPQExpBuffer(tag);
+
+ return aclDumpId;
+}
+
+static bool
+dumpACLQuery(Archive *fout, PQExpBuffer query, PQExpBuffer tag,
+ DumpId objDumpId, DumpId altDumpId,
+ const char *type, const char *name, const char *subname,
+ const char *nspname, const char *owner,
+ const char *acls, const char *racls,
+ const char *initacls, const char *initracls)
+{
DumpOptions *dopt = fout->dopt;
- PQExpBuffer sql;
+ bool haveACL = false;
/* Do nothing if ACL dump is not enabled */
if (dopt->aclsSkip)
- return InvalidDumpId;
+ return false;
/* --data-only skips ACLs *except* BLOB ACLs */
if (dopt->dataOnly && strcmp(type, "LARGE OBJECT") != 0)
- return InvalidDumpId;
-
- sql = createPQExpBuffer();
+ return false;
/*
* Check to see if this object has had any initial ACLs included for it.
@@ -15093,54 +15203,31 @@ dumpACL(Archive *fout, DumpId objDumpId, DumpId altDumpId,
*/
if (strlen(initacls) != 0 || strlen(initracls) != 0)
{
- appendPQExpBufferStr(sql, "SELECT pg_catalog.binary_upgrade_set_record_init_privs(true);\n");
+ haveACL = true;
+ appendPQExpBufferStr(query, "SELECT pg_catalog.binary_upgrade_set_record_init_privs(true);\n");
if (!buildACLCommands(name, subname, nspname, type,
initacls, initracls, owner,
- "", fout->remoteVersion, sql))
+ "", fout->remoteVersion, query))
fatal("could not parse initial GRANT ACL list (%s) or initial REVOKE ACL list (%s) for object \"%s\" (%s)",
initacls, initracls, name, type);
- appendPQExpBufferStr(sql, "SELECT pg_catalog.binary_upgrade_set_record_init_privs(false);\n");
+ appendPQExpBufferStr(query, "SELECT pg_catalog.binary_upgrade_set_record_init_privs(false);\n");
}
if (!buildACLCommands(name, subname, nspname, type,
acls, racls, owner,
- "", fout->remoteVersion, sql))
+ "", fout->remoteVersion, query))
fatal("could not parse GRANT ACL list (%s) or REVOKE ACL list (%s) for object \"%s\" (%s)",
acls, racls, name, type);
- if (sql->len > 0)
+ if (haveACL && tag != NULL)
{
- PQExpBuffer tag = createPQExpBuffer();
- DumpId aclDeps[2];
- int nDeps = 0;
-
if (subname)
appendPQExpBuffer(tag, "COLUMN %s.%s", name, subname);
else
appendPQExpBuffer(tag, "%s %s", type, name);
-
- aclDeps[nDeps++] = objDumpId;
- if (altDumpId != InvalidDumpId)
- aclDeps[nDeps++] = altDumpId;
-
- aclDumpId = createDumpId();
-
- ArchiveEntry(fout, nilCatalogId, aclDumpId,
- ARCHIVE_OPTS(.tag = tag->data,
- .namespace = nspname,
- .owner = owner,
- .description = "ACL",
- .section = SECTION_NONE,
- .createStmt = sql->data,
- .deps = aclDeps,
- .nDeps = nDeps));
-
- destroyPQExpBuffer(tag);
}
- destroyPQExpBuffer(sql);
-
- return aclDumpId;
+ return haveACL;
}
/*
@@ -15166,34 +15253,58 @@ dumpSecLabel(Archive *fout, const char *type, const char *name,
const char *namespace, const char *owner,
CatalogId catalogId, int subid, DumpId dumpId)
{
+ PQExpBuffer query = createPQExpBuffer();
+ PQExpBuffer tag = createPQExpBuffer();
+
+ if (dumpSecLabelQuery(fout, query, tag, type, name,
+ namespace, owner, catalogId, subid, dumpId))
+ {
+ ArchiveEntry(fout, nilCatalogId, createDumpId(),
+ ARCHIVE_OPTS(.tag = tag->data,
+ .namespace = namespace,
+ .owner = owner,
+ .description = "SECURITY LABEL",
+ .section = SECTION_NONE,
+ .createStmt = query->data,
+ .deps = &dumpId,
+ .nDeps = 1));
+ }
+
+ destroyPQExpBuffer(query);
+ destroyPQExpBuffer(tag);
+}
+
+static bool
+dumpSecLabelQuery(Archive *fout, PQExpBuffer query, PQExpBuffer tag,
+ const char *type, const char *name,
+ const char *namespace, const char *owner,
+ CatalogId catalogId, int subid, DumpId dumpId)
+{
DumpOptions *dopt = fout->dopt;
SecLabelItem *labels;
int nlabels;
int i;
- PQExpBuffer query;
/* do nothing, if --no-security-labels is supplied */
if (dopt->no_security_labels)
- return;
+ return false;
/* Security labels are schema not data ... except blob labels are data */
if (strcmp(type, "LARGE OBJECT") != 0)
{
if (dopt->dataOnly)
- return;
+ return false;
}
else
{
/* We do dump blob security labels in binary-upgrade mode */
if (dopt->schemaOnly && !dopt->binary_upgrade)
- return;
+ return false;
}
/* Search for security labels associated with catalogId, using table */
nlabels = findSecLabels(fout, catalogId.tableoid, catalogId.oid, &labels);
- query = createPQExpBuffer();
-
for (i = 0; i < nlabels; i++)
{
/*
@@ -15214,22 +15325,11 @@ dumpSecLabel(Archive *fout, const char *type, const char *name,
if (query->len > 0)
{
- PQExpBuffer tag = createPQExpBuffer();
-
appendPQExpBuffer(tag, "%s %s", type, name);
- ArchiveEntry(fout, nilCatalogId, createDumpId(),
- ARCHIVE_OPTS(.tag = tag->data,
- .namespace = namespace,
- .owner = owner,
- .description = "SECURITY LABEL",
- .section = SECTION_NONE,
- .createStmt = query->data,
- .deps = &dumpId,
- .nDeps = 1));
- destroyPQExpBuffer(tag);
+ return true;
}
- destroyPQExpBuffer(query);
+ return false;
}
/*
diff --git a/src/bin/pg_dump/pg_restore.c b/src/bin/pg_dump/pg_restore.c
index 589b4ae..b16db03 100644
--- a/src/bin/pg_dump/pg_restore.c
+++ b/src/bin/pg_dump/pg_restore.c
@@ -59,6 +59,7 @@ main(int argc, char **argv)
int c;
int exit_code;
int numWorkers = 1;
+ int blobBatchSize = 0;
Archive *AH;
char *inputFileSpec;
static int disable_triggers = 0;
@@ -120,6 +121,7 @@ main(int argc, char **argv)
{"no-publications", no_argument, &no_publications, 1},
{"no-security-labels", no_argument, &no_security_labels, 1},
{"no-subscriptions", no_argument, &no_subscriptions, 1},
+ {"restore-blob-batch-size", required_argument, NULL, 4},
{NULL, 0, NULL, 0}
};
@@ -280,6 +282,10 @@ main(int argc, char **argv)
set_dump_section(optarg, &(opts->dumpSections));
break;
+ case 4: /* # of blobs to restore per transaction */
+ blobBatchSize = atoi(optarg);
+ break;
+
default:
fprintf(stderr, _("Try \"%s --help\" for more information.\n"), progname);
exit_nicely(1);
@@ -434,6 +440,7 @@ main(int argc, char **argv)
SortTocFromFile(AH);
AH->numWorkers = numWorkers;
+ AH->blobBatchSize = blobBatchSize;
if (opts->tocSummary)
PrintTOCSummary(AH);
@@ -506,6 +513,8 @@ usage(const char *progname)
printf(_(" --use-set-session-authorization\n"
" use SET SESSION AUTHORIZATION commands instead of\n"
" ALTER OWNER commands to set ownership\n"));
+ printf(_(" --restore-blob-batch-size=NUM\n"
+ " attempt to restore NUM large objects per transaction\n"));
printf(_("\nConnection options:\n"));
printf(_(" -h, --host=HOSTNAME database server host or socket directory\n"));
diff --git a/src/bin/pg_upgrade/option.c b/src/bin/pg_upgrade/option.c
index 9c9b313..868e9f6 100644
--- a/src/bin/pg_upgrade/option.c
+++ b/src/bin/pg_upgrade/option.c
@@ -57,6 +57,8 @@ parseCommandLine(int argc, char *argv[])
{"verbose", no_argument, NULL, 'v'},
{"clone", no_argument, NULL, 1},
{"index-collation-versions-unknown", no_argument, NULL, 2},
+ {"restore-jobs", required_argument, NULL, 3},
+ {"restore-blob-batch-size", required_argument, NULL, 4},
{NULL, 0, NULL, 0}
};
@@ -208,6 +210,14 @@ parseCommandLine(int argc, char *argv[])
user_opts.ind_coll_unknown = true;
break;
+ case 3:
+ user_opts.restore_jobs = atoi(optarg);
+ break;
+
+ case 4:
+ user_opts.blob_batch_size = atoi(optarg);
+ break;
+
default:
fprintf(stderr, _("Try \"%s --help\" for more information.\n"),
os_info.progname);
@@ -314,6 +324,8 @@ usage(void)
printf(_(" --clone clone instead of copying files to new cluster\n"));
printf(_(" --index-collation-versions-unknown\n"));
printf(_(" mark text indexes as needing to be rebuilt\n"));
+ printf(_(" --restore-blob-batch-size=NUM attempt to restore NUM large objects per\n"));
+ printf(_(" transaction\n"));
printf(_(" -?, --help show this help, then exit\n"));
printf(_("\n"
"Before running pg_upgrade you must:\n"
diff --git a/src/bin/pg_upgrade/pg_upgrade.c b/src/bin/pg_upgrade/pg_upgrade.c
index e23b8ca..095e980 100644
--- a/src/bin/pg_upgrade/pg_upgrade.c
+++ b/src/bin/pg_upgrade/pg_upgrade.c
@@ -385,10 +385,13 @@ create_new_objects(void)
parallel_exec_prog(log_file_name,
NULL,
"\"%s/pg_restore\" %s %s --exit-on-error --verbose "
+ "--jobs %d --restore-blob-batch-size %d "
"--dbname template1 \"%s\"",
new_cluster.bindir,
cluster_conn_opts(&new_cluster),
create_opts,
+ user_opts.restore_jobs,
+ user_opts.blob_batch_size,
sql_file_name);
}
diff --git a/src/bin/pg_upgrade/pg_upgrade.h b/src/bin/pg_upgrade/pg_upgrade.h
index 919a784..5647f96 100644
--- a/src/bin/pg_upgrade/pg_upgrade.h
+++ b/src/bin/pg_upgrade/pg_upgrade.h
@@ -291,6 +291,8 @@ typedef struct
* changes */
transferMode transfer_mode; /* copy files or link them? */
int jobs; /* number of processes/threads to use */
+ int restore_jobs; /* number of pg_restore --jobs to use */
+ int blob_batch_size; /* number of blobs to restore per xact */
char *socketdir; /* directory to use for Unix sockets */
bool ind_coll_unknown; /* mark unknown index collation versions */
} UserOpts;
^ permalink raw reply [nested|flat] 43+ messages in thread
end of thread, other threads:[~2021-03-24 16:05 UTC | newest]
Thread overview: 43+ messages (download: mbox mbox.gz follow: Atom feed)
-- links below jump to the message on this page --
2021-03-08 11:02 Re: pg_upgrade failing for 200+ million Large Objects Tharakan, Robins <[email protected]>
2021-03-08 12:33 ` Magnus Hagander <[email protected]>
2021-03-08 14:13 ` Robins Tharakan <[email protected]>
2021-03-08 16:33 ` Tom Lane <[email protected]>
2021-03-08 16:35 ` Magnus Hagander <[email protected]>
2021-03-08 16:58 ` Tom Lane <[email protected]>
2021-03-08 17:18 ` Magnus Hagander <[email protected]>
2021-03-20 04:39 ` Jan Wieck <[email protected]>
2021-03-20 15:17 ` Andrew Dunstan <[email protected]>
2021-03-20 15:23 ` Tom Lane <[email protected]>
2021-03-20 16:45 ` Bruce Momjian <[email protected]>
2021-03-20 16:53 ` Tom Lane <[email protected]>
2021-03-20 16:55 ` Jan Wieck <[email protected]>
2021-03-21 11:47 ` Andrew Dunstan <[email protected]>
2021-03-21 16:56 ` Jan Wieck <[email protected]>
2021-03-21 18:18 ` Andrew Dunstan <[email protected]>
2021-03-22 21:36 ` Zhihong Yu <[email protected]>
2021-03-22 23:18 ` Jan Wieck <[email protected]>
2021-03-23 12:51 ` Jan Wieck <[email protected]>
2021-03-23 14:56 ` Bruce Momjian <[email protected]>
2021-03-23 17:25 ` Jan Wieck <[email protected]>
2021-03-23 18:06 ` Bruce Momjian <[email protected]>
2021-03-23 18:23 ` Jan Wieck <[email protected]>
2021-03-23 18:25 ` Bruce Momjian <[email protected]>
2021-03-23 18:35 ` Tom Lane <[email protected]>
2021-03-23 18:54 ` Jan Wieck <[email protected]>
2021-03-23 18:59 ` Tom Lane <[email protected]>
2021-03-23 19:22 ` Jan Wieck <[email protected]>
2021-03-23 19:35 ` Tom Lane <[email protected]>
2021-03-23 19:59 ` Jan Wieck <[email protected]>
2021-03-23 20:55 ` Tom Lane <[email protected]>
2021-03-24 16:04 ` Jan Wieck <[email protected]>
2021-03-24 16:05 ` Jan Wieck <[email protected]>
2021-03-21 16:50 ` Fix pg_upgrade to preserve datdba (was: Re: pg_upgrade failing for 200+ million Large Objects) Jan Wieck <[email protected]>
2021-03-21 16:57 ` Re: Fix pg_upgrade to preserve datdba (was: Re: pg_upgrade failing for 200+ million Large Objects) Tom Lane <[email protected]>
2021-03-21 17:15 ` Re: Fix pg_upgrade to preserve datdba Jan Wieck <[email protected]>
2021-03-21 17:50 ` Re: Fix pg_upgrade to preserve datdba Jan Wieck <[email protected]>
2021-03-21 18:23 ` Re: Fix pg_upgrade to preserve datdba Tom Lane <[email protected]>
2021-03-21 18:34 ` Re: Fix pg_upgrade to preserve datdba Tom Lane <[email protected]>
2021-03-21 19:29 ` Re: Fix pg_upgrade to preserve datdba Tom Lane <[email protected]>
2021-03-21 19:36 ` Re: Fix pg_upgrade to preserve datdba Jan Wieck <[email protected]>
2021-03-21 19:56 ` Re: Fix pg_upgrade to preserve datdba Tom Lane <[email protected]>
2021-03-22 18:07 ` Re: Fix pg_upgrade to preserve datdba Jan Wieck <[email protected]>
This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox