Date: Wed, 24 Jul 2024 09:17:51 -0500
From: Justin Pryzby
To: Tom Lane
Cc: Nathan Bossart, Michael Banck, Laurenz Albe, vignesh C, "Kumar, Sachin", Robins Tharakan, Jan Wieck, Bruce Momjian, Andrew Dunstan, Magnus Hagander, Peter Eisentraut, pgsql-hackers@postgresql.org
Subject: Re: pg_upgrade failing for 200+ million Large Objects
In-Reply-To: <1217588.1711999706@sss.pgh.pa.us>

On Mon, Apr 01, 2024 at 03:28:26PM -0400, Tom Lane wrote:
> Nathan Bossart writes:
> > The one design point that worries me a little is the non-configurability of
> > --transaction-size in pg_upgrade.  I think it's fine to default it to 1,000
> > or something, but given how often I've had to fiddle with
> > max_locks_per_transaction, I'm wondering if we might regret hard-coding it.
>
> Well, we could add a command-line switch to pg_upgrade, but I'm
> unconvinced that it'd be worth the trouble.  I think a very large
> fraction of users invoke pg_upgrade by means of packager-supplied
> scripts that are unlikely to provide a way to pass through such
> a switch.  I'm inclined to say let's leave it as-is until we get
> some actual field requests for a switch.

I've been importing our schemas and doing upgrade testing, and was
surprised when a postgres backend was killed for OOM during pg_upgrade:

Killed process 989302 (postgres) total-vm:5495648kB, anon-rss:5153292kB, ...

Upgrading from v16 => v16 doesn't use nearly as much RAM.

While tracking down the responsible commit, I reproduced the problem
using a subset of tables; at 959b38d770, the backend process used
~650 MB RAM, and at its parent commit it used at most ~120 MB.

959b38d770b Invent --transaction-size option for pg_restore.

By changing RESTORE_TRANSACTION_SIZE to 100, backend RAM use goes to
180 MB during pg_upgrade, which is reasonable.

With partitioning, we have a lot of tables, some of them wide (126
partitioned tables, 8942 children, total 1019315 columns).  I didn't track
whether certain parts of our schema contribute most to the high backend
memory use, just that it's now 5x (while testing a subset) to 50x higher.

We'd surely prefer that the transaction size be configurable.

-- 
Justin
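
For reference, the memory figures quoted in the message can be put side by side with a short calculation. This is only a sketch: the constant and function names are illustrative, the numbers are copied from the message, and it assumes the "kB" in the kernel's OOM line means 1024 bytes.

```python
# Figures quoted in the message; all names here are illustrative only.
RSS_AT_959B38D770_MB = 650   # backend RSS at commit 959b38d770 (subset of tables)
RSS_AT_PARENT_MB = 120       # backend RSS at its parent commit (upper bound)
RSS_WITH_SIZE_100_MB = 180   # after changing RESTORE_TRANSACTION_SIZE to 100
OOM_ANON_RSS_KB = 5153292    # anon-rss reported in the OOM killer line


def ratio(after_mb: float, before_mb: float) -> float:
    """Return how many times larger after_mb is than before_mb."""
    return after_mb / before_mb


def kb_to_gib(kb: int) -> float:
    """Convert kB to GiB, assuming 1 kB = 1024 bytes (an assumption)."""
    return kb / (1024 * 1024)


regression = ratio(RSS_AT_959B38D770_MB, RSS_AT_PARENT_MB)
with_size_100 = ratio(RSS_WITH_SIZE_100_MB, RSS_AT_PARENT_MB)
oom_gib = kb_to_gib(OOM_ANON_RSS_KB)

print(f"subset, at 959b38d770 vs parent:      {regression:.1f}x")
print(f"subset, transaction size 100 vs parent: {with_size_100:.1f}x")
print(f"backend anon-rss when OOM-killed:       {oom_gib:.1f} GiB")
```

On the subset, this puts the regression at a bit over 5x, consistent with the "5x" quoted in the message, and shows that a transaction size of 100 brings the backend to 1.5x the parent commit's usage.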