Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1sXTqI-004GVt-Uc for pgsql-hackers@arkaria.postgresql.org; Fri, 26 Jul 2024 22:55:18 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.94.2) (envelope-from ) id 1sXTqG-000uxd-IU for pgsql-hackers@arkaria.postgresql.org; Fri, 26 Jul 2024 22:55:16 +0000 Received: from makus.postgresql.org ([2001:4800:3e1:1::229]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1sXTqG-000uxP-7j for pgsql-hackers@lists.postgresql.org; Fri, 26 Jul 2024 22:55:16 +0000 Received: from mail-yb1-xb30.google.com ([2607:f8b0:4864:20::b30]) by makus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.94.2) (envelope-from ) id 1sXTqD-001cqA-JY for pgsql-hackers@postgresql.org; Fri, 26 Jul 2024 22:55:15 +0000 Received: by mail-yb1-xb30.google.com with SMTP id 3f1490d57ef6-e05e94a979eso136441276.0 for ; Fri, 26 Jul 2024 15:55:13 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1722034512; x=1722639312; darn=postgresql.org; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=abNEQ1cjoXuIS4EX/tgL1kT4OloRuiRDLLy1tMm0rkA=; b=dH0n4LDFlyeLrA0MwAspmSeEGTqFpmNzl4R2UbspNaV6z7w1dFpd080Iclrlu9TkbZ 8aefI6bgoLUiekr1z5cfgYYpMSOjKUDxCzRJpm3wnztwbfMMUL14+Wj3wRDDgCcKTDzU gReDlIbzXjv1BgWi0bNR0C1NRFzBLuCwXNi90AkjCwWGAMK19Db/iTxl8Tkn+tNd1raZ UjsRRR8QxVoTpfsNdIM3bnVZUZ6rGh5e2nkIywBGPNWqzPah82Bcwe0qoRBSgnbWFucZ w43cONBu81QcRuLzZhTVOYGCPDT8gIxOe0jNyLpfRBqMWo9cb6R9T5vJb92PL1PZLba/ Zh0Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1722034512; x=1722639312; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=abNEQ1cjoXuIS4EX/tgL1kT4OloRuiRDLLy1tMm0rkA=; b=GLiMT6joWFMSgZrwRQoRnAd7JLJ81zSoQW9kzBDxD5ytmBR78GLs4oBXKG7aG3lB89 faXnGMPz779rJm47Zcr3em/RrmoTKuGd5YWsLbV6AH8LFSRQHsZWmM9QjwxrhHkFIvs8 oJld6MkWge1NN1BIeT0B8ivZPPcaffdj/G9QCX5s/g/xRofjYvEEVgF6VaLzroZdKNZm Pw4+GKMU3uV3nFd9TeZdy236wVxSQd+WBrgkZG4UgSelhv3bZmaQSPlvmFTHWknrO3rx W1GYaZHHuRAGNPLMegl6/iVtrRmMPoiMhAVltQeLIncjT+BKaqIPHNc/rFxjhQbKCRqQ t+Yw== X-Forwarded-Encrypted: i=1; AJvYcCXeRIBqkVk4sfVq5Arhxr/AOboCd4kc7Ss6VGz2/mz3qQ9Cv0HhPgIX/E2VxuspC4XfgtaErmqOESd0H5TlQyvDb6SPoVtJb2ZKvLQn X-Gm-Message-State: AOJu0YwJDFYO+sL9C12HCKaAKnBI9W0wAkfmJsDkhxAyQw5UBOxyQJ98 50/upMBygxHyH97vbRID5IEMaWikMT8IpQeJePn/D4kyBSYyHmV6uPbyrjKDO0wWvgy93xDye+u st7g1iXf31ErLin2+m3Uqh3GBA8A= X-Google-Smtp-Source: AGHT+IF+6xE9xvqBOl976kXGjC8R64zHqUCm6Ds1twoksBU2/woVMH5/JttEiqU54D16YuIb/J7GIQm557PhKftVvqU= X-Received: by 2002:a05:6902:2402:b0:e0b:49e0:1422 with SMTP id 3f1490d57ef6-e0b55989cdbmr1108617276.24.1722034512441; Fri, 26 Jul 2024 15:55:12 -0700 (PDT) MIME-Version: 1.0 References: <4a3ebf7d81bfc6dd4d545e5b27d6e8f6c32d8937.camel@cybertec.at> <3023817.1710629175@sss.pgh.pa.us> <6603e4e0.500a0220.a557f.4f39@mx.google.com> <3304322.1711551245@sss.pgh.pa.us> <20240327150826.GB3994937@nathanxps13> <20240401191930.GA2302032@nathanxps13> <1217588.1711999706@sss.pgh.pa.us> <1870579.1722033430@sss.pgh.pa.us> In-Reply-To: <1870579.1722033430@sss.pgh.pa.us> From: Alexander Korotkov Date: Sat, 27 Jul 2024 01:55:00 +0300 Message-ID: Subject: Re: pg_upgrade failing for 200+ million Large Objects To: Tom Lane Cc: Justin Pryzby , Nathan Bossart , Michael Banck , Laurenz Albe , vignesh C , "Kumar, Sachin" , Robins Tharakan , Jan Wieck , Bruce Momjian , Andrew Dunstan , Magnus Hagander , Peter Eisentraut , pgsql-hackers@postgresql.org Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk On Sat, Jul 27, 2024 at 1:37=E2=80=AFAM Tom Lane wrote: > Justin Pryzby writes: > > On Fri, Jul 26, 2024 at 10:53:30PM +0300, Alexander Korotkov wrote: > >> It would be nice to identify such cases and check which memory context= s are > >> growing and why. > > > I reproduced the problem with this schema: > > > SELECT format('CREATE TABLE p(i int, %s) PARTITION BY RANGE(i)', array_= to_string(a, ', ')) FROM (SELECT array_agg(format('i%s int', i))a FROM gene= rate_series(1,999)i); > > SELECT format('CREATE TABLE t%s PARTITION OF p FOR VALUES FROM (%s)TO(%= s)', i,i,i+1) FROM generate_series(1,999)i; > > > This used over 4 GB of RAM. > > Interesting. This doesn't bloat particularly much in a regular > pg_restore, even with --transaction-size=3D1000; but it does in > pg_upgrade, as you say. I found that the bloat was occurring > during these long sequences of UPDATE commands issued by pg_upgrade: > > -- For binary upgrade, recreate inherited column. > UPDATE pg_catalog.pg_attribute > SET attislocal =3D false > WHERE attname =3D 'i' > AND attrelid =3D '\"public\".\"t139\"'::pg_catalog.regclass; > > -- For binary upgrade, recreate inherited column. > UPDATE pg_catalog.pg_attribute > SET attislocal =3D false > WHERE attname =3D 'i1' > AND attrelid =3D '\"public\".\"t139\"'::pg_catalog.regclass; > > -- For binary upgrade, recreate inherited column. > UPDATE pg_catalog.pg_attribute > SET attislocal =3D false > WHERE attname =3D 'i2' > AND attrelid =3D '\"public\".\"t139\"'::pg_catalog.regclass; > > I think the problem is basically that each one of these commands > causes a relcache inval, for which we can't reclaim space right > away, so that we end up consuming O(N^2) cache space for an > N-column inherited table. I was about to report the same. > It's fairly easy to fix things so that this example doesn't cause > that to happen: we just need to issue these updates as one command > not N commands per table. See attached. However, I fear this should > just be considered a draft, because the other code for binary upgrade > in the immediate vicinity is just as aggressively stupid and > unoptimized as this bit, and can probably also be driven to O(N^2) > behavior with enough CHECK constraints etc. We've gone out of our way > to make ALTER TABLE capable of handling many updates to a table's DDL > in one command, but whoever wrote this code appears not to have read > that memo, or at least to have believed that performance of pg_upgrade > isn't of concern. I was thinking about counting actual number of queries, not TOC entries for transaction number as a more universal solution. But that would require usage of psql_scan() or writing simpler alternative for this particular purpose. That looks quite annoying. What do you think? ------ Regards, Alexander Korotkov Supabase