public inbox for [email protected]  
help / color / mirror / Atom feed
From: Alexander Korotkov <[email protected]>
To: Tom Lane <[email protected]>
Cc: Justin Pryzby <[email protected]>
Cc: Nathan Bossart <[email protected]>
Cc: Michael Banck <[email protected]>
Cc: Laurenz Albe <[email protected]>
Cc: vignesh C <[email protected]>
Cc: Kumar, Sachin <[email protected]>
Cc: Robins Tharakan <[email protected]>
Cc: Jan Wieck <[email protected]>
Cc: Bruce Momjian <[email protected]>
Cc: Andrew Dunstan <[email protected]>
Cc: Magnus Hagander <[email protected]>
Cc: Peter Eisentraut <[email protected]>
Cc: [email protected]
Subject: Re: pg_upgrade failing for 200+ million Large Objects
Date: Sat, 27 Jul 2024 01:55:00 +0300
Message-ID: <CAPpHfdtpr2FYVYqmARZdSp=2zsndgJgmSsHFxFFNTiYRfX8uaA@mail.gmail.com> (raw)
In-Reply-To: <[email protected]>
References: <[email protected]>
	<[email protected]>
	<[email protected]>
	<[email protected]>
	<[email protected]>
	<20240327150826.GB3994937@nathanxps13>
	<20240401191930.GA2302032@nathanxps13>
	<[email protected]>
	<ZqEND4ZcTDBmcv31@pryzbyj2023>
	<CAPpHfduUSA7c0YrTxN443SwXXCL41vmXkavYJjN8GS3Wc5Z42w@mail.gmail.com>
	<ZqQIxIsaPWO6coil@pryzbyj2023>
	<[email protected]>

On Sat, Jul 27, 2024 at 1:37 AM Tom Lane <[email protected]> wrote:
> Justin Pryzby <[email protected]> writes:
> > On Fri, Jul 26, 2024 at 10:53:30PM +0300, Alexander Korotkov wrote:
> >> It would be nice to identify such cases and check which memory contexts are
> >> growing and why.
>
> > I reproduced the problem with this schema:
>
> > SELECT format('CREATE TABLE p(i int, %s) PARTITION BY RANGE(i)', array_to_string(a, ', ')) FROM (SELECT array_agg(format('i%s int', i))a FROM generate_series(1,999)i);
> > SELECT format('CREATE TABLE t%s PARTITION OF p FOR VALUES FROM (%s)TO(%s)', i,i,i+1) FROM generate_series(1,999)i;
>
> > This used over 4 GB of RAM.
>
> Interesting.  This doesn't bloat particularly much in a regular
> pg_restore, even with --transaction-size=1000; but it does in
> pg_upgrade, as you say.  I found that the bloat was occurring
> during these long sequences of UPDATE commands issued by pg_upgrade:
>
> -- For binary upgrade, recreate inherited column.
> UPDATE pg_catalog.pg_attribute
> SET attislocal = false
> WHERE attname = 'i'
>   AND attrelid = '\"public\".\"t139\"'::pg_catalog.regclass;
>
> -- For binary upgrade, recreate inherited column.
> UPDATE pg_catalog.pg_attribute
> SET attislocal = false
> WHERE attname = 'i1'
>   AND attrelid = '\"public\".\"t139\"'::pg_catalog.regclass;
>
> -- For binary upgrade, recreate inherited column.
> UPDATE pg_catalog.pg_attribute
> SET attislocal = false
> WHERE attname = 'i2'
>   AND attrelid = '\"public\".\"t139\"'::pg_catalog.regclass;
>
> I think the problem is basically that each one of these commands
> causes a relcache inval, for which we can't reclaim space right
> away, so that we end up consuming O(N^2) cache space for an
> N-column inherited table.

I was about to report the same.

> It's fairly easy to fix things so that this example doesn't cause
> that to happen: we just need to issue these updates as one command
> not N commands per table.  See attached.  However, I fear this should
> just be considered a draft, because the other code for binary upgrade
> in the immediate vicinity is just as aggressively stupid and
> unoptimized as this bit, and can probably also be driven to O(N^2)
> behavior with enough CHECK constraints etc.  We've gone out of our way
> to make ALTER TABLE capable of handling many updates to a table's DDL
> in one command, but whoever wrote this code appears not to have read
> that memo, or at least to have believed that performance of pg_upgrade
> isn't of concern.

I was thinking about counting actual number of queries, not TOC
entries for transaction number as a more universal solution.  But that
would require usage of psql_scan() or writing simpler alternative for
this particular purpose.  That looks quite annoying.  What do you
think?


------
Regards,
Alexander Korotkov
Supabase






view thread (15+ messages)  latest in thread

reply

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Reply to all the recipients using the --to and --cc options:
  reply via email

  To: [email protected]
  Cc: [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected]
  Subject: Re: pg_upgrade failing for 200+ million Large Objects
  In-Reply-To: <CAPpHfdtpr2FYVYqmARZdSp=2zsndgJgmSsHFxFFNTiYRfX8uaA@mail.gmail.com>

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox