Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1u3CiU-00C6Z4-99 for pgsql-general@arkaria.postgresql.org; Fri, 11 Apr 2025 11:38:38 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.94.2) (envelope-from ) id 1u3CiR-008VVO-Hp for pgsql-general@arkaria.postgresql.org; Fri, 11 Apr 2025 11:38:35 +0000 Received: from makus.postgresql.org ([2001:4800:3e1:1::229]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1u3CiQ-008VVG-V2 for pgsql-general@lists.postgresql.org; Fri, 11 Apr 2025 11:38:35 +0000 Received: from mout.gmx.net ([212.227.17.22]) by makus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1u3CiO-004EFU-3C for pgsql-general@lists.postgresql.org; Fri, 11 Apr 2025 11:38:34 +0000 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmx.net; s=s31663417; t=1744371510; x=1744976310; i=jimis@gmx.net; bh=fbzTGIZkcj3ihagIYXDOX62L9RYg8DH4v9sWVO37pV8=; h=X-UI-Sender-Class:Date:From:To:Subject:In-Reply-To:Message-ID: References:MIME-Version:Content-Type:Content-Transfer-Encoding:cc: content-transfer-encoding:content-type:date:from:message-id: mime-version:reply-to:subject:to; b=e+6JSuUd07KxPgCnZDjqH9YOIkJ4snmtDLuQydNczmm5iRobXWHzyeTcKkQo77vg D/vBFYBVwn1PwtpRjtopKr8dWKvKFUGoFC5S0rV9VumKUckG3+vukgvMht9LPbb5F JzHa3+bahjMaB3dh296Kzz/cpt5wpdqusCDmb6TewRH3Z+ICl2LvZxubEpGFRvqKY WqVdZeImBnFl5fFe5VCRQmwMDIySj6BgaWozhemYBW6UL58WdkzjZV7viQSxMoI6T WVZQIH5qYySC4BH/QRXt/zJflWaI5uk7LgvbwuJd4Hbxx9vXpn+40ABAq5WLpWaYT GGtXvF41qX6wvghing== X-UI-Sender-Class: 724b4f7f-cbec-4199-ad4e-598c01a50d3a Received: from [10.9.70.81] ([185.55.106.54]) by mail.gmx.net (mrgmx105 [212.227.17.168]) with ESMTPSA (Nemesis) id 1MfpOT-1tNWAh2UDB-00e3UF for ; Fri, 11 Apr 2025 13:38:30 +0200 Date: Fri, 11 Apr 2025 13:38:26 +0200 (CEST) From: Dimitrios Apostolou To: pgsql-general@lists.postgresql.org Subject: Re: pg_restore causing ENOSPACE on the WAL partition In-Reply-To: <076464ad-3d70-dd25-9e8f-e84f27decfba@gmx.net> Message-ID: <738db902-9462-8566-3133-8f5f127352d0@gmx.net> References: <076464ad-3d70-dd25-9e8f-e84f27decfba@gmx.net> MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII; format=flowed X-Provags-ID: V03:K1:AT1DdjGQkCF75FTNKlkvgF4SmITqTyPjOv6LMBGS+iDQsfMuwrU 85hPY2/S1G26xhwOyqOe2rfW38PSeAjYMuz2tCk7XhTilV+SeCh4Akb9bVcs43CQmHCdS4K QVl32sPOgWdCf2yD75IaKmK/6fKAfQIzbP2vEJjm96JeYsbBaF/R9O833fYDcT9baGxUZsz o8j9cGA6pz+kKyxhK2BBA== X-Spam-Flag: NO UI-OutboundReport: notjunk:1;M01:P0:uorlErIsibU=;0vpB2BKir6QFhUcgX8gd9gF/Wr6 q6u1lq91jLmHR/F+OzGkQszvBLBJNk9pxQUY+xvJj0Jfi5DW9SjXaNlnRFl1TxI8ajSaJ6z0h 5WcnO2Rq7Z8KQuyBBa15Fut+C+tK3UlSXRFG8i95ii0MrU6JZ27F6i1Ds/sdvH0ZmJKDLx/hU c9Bpykr3C4wWdcMOoOxMCaTS5+5zPl91ScRrMCF5jdBenGMVm23Ig8CL5YYy8+K3KEq5fMrht NB01MxiPu0GvPO3qbOnJI1Aek3EuJVz3OX4nuAJEnhIw8+fxiwUqgxePNhQg3AJzYu+BXk21F S7TijR7sLmk+dklV7ytXloftLxBqXMiwRvi0P7oMxnBa6o3jUtRht6/av5DmLPTtdKrP8MYLa Dabny48bIZM1mbhOfTtubTeGltQq2qHCm1EQyJ038O7OQEl5WJ0+i/p08zlMTfbOr7oJyi0VQ uCpn7DEQZV73c+6hnk+vsAMjpizLoOxUWjeKcLGIMxedcEIAoVWIQeyxgA9+M7Rfd9T/+U2Qi 2vyMx3qrZI5T2Z3ZhyXaQUMh4wLqnGqgla1fb76Ks/3IN8xBN/S0NwNE8BN6J02ojSwAp7z5S pJ7VvXAUFBbZwHZnwt0/w2W7crNMPHN2tuvk241FdhVU5Gl2yKjiCQTtm4N2aOAT4LK4aW4mU fcy30Y74fA5hk2tDtoylIbwP5zicI35ygN2eBaE7tIwtQSUHwB+WWkAIu2/U9aZiHJVdV9zZS ihmJ5iG/Fxx+Xnbud+5kNxT4di6cqaiMtibpZwz4TVGlKP3mje8LRNwP36sidjO9cStGInCqH TMY7SjVnX1rzjPpwSE+QZX8/EMOrqNqRlFiuArh/mrfMuUttc2b9C8ZqArn+a3YCgxxnDMmyO 82JVIKRNrGd7G+Fo7ZmA3cDCdiQRNBTc8dSaWlT3nQwnzrSOnZemcc16wzQltHGpetf3gucYw qibxoxHF3VGTPzAAylkXcQEJGJtwCJX5Tl3hvoDNXPV1vbuvzzBelzWPlfI5NVT4VFdeVhCYy IXeGp/64cfUy8IxCkwGiqlksWsWYigVZj0vCZ4Jghkxv3WNcHlIArwKwqxDqnIJpi4rD8QZUD lxcgh7ubJoMZi+HZz8cezdEBZPUX4LHsOO7X/X2sPSfKb3ycQSBWtfij7e3hEOcDMhUdbuN+D 59XuVKKRdH5Je6s0CJn2qWiObP0Vv458MpJ7BUyqPGA8re9SFemkls+MZUfPqOQKl3yn/9nBw Y6mbNlMeMIzCR0K53ZSNHxh8lzVXbIiedZu5Uv7Z6EL9AmFHq6zNJUNsnXBXgrRnDW6m6nAxT /hHmDD+3E+NoM3Sk4aQ3RTnb7W/NDiWWvWOHgc+KbqgkdlPWEAORwRIOqtKVuZoTiHuLjQcUP jJUNywAzXV8JexZu/l5h+vfSfqqvFWm20OZDssgU8CwXWJsbodT0sPtEPjxb9z1AGjgSABug4 MukvZWFKa/Gv3LnCDBQ8AntVb4QnS8MX2ntx+hq0Cwkyps/XVBNYRnnj5U9gpOmt+IjL6MopZ zWspNdRImGUzw+L5oPf7pJBX/InEKD3uMF1skpcDKDoXxtYDhPLq4teyf6a6lPOaKU4tbtb91 1evYP0WXubeiufJzBgpLOSoeLBH8Zp1yyuuGmTl0rYzl1Omy93/OcP6wOjsN6G/gc3URRXwvG CJ27/ta5ywgGgTM1vpRciRupQkmd5MW2/BGY4bwu0yWEg7b64qawz+bj7+2L/2Ptknarynq7X LtTcpbY76pIr+yWMZoSYiItIwLXTXRVOIgLTtPSNMtV7Sc+WDf9o9eubkaJQXPgm2oJ1wF76f GFaVc9b/a/ryJfJYHGZq1Oot9KVZDjWqmNizL/V5ogvKauXMBLUasW6e/lTvw9nC6GEvDNSRx 09TLH3VgdMMkIjZPmo2vhNWX4mRhlci9DRoxn6e18K/hjjbPC7IVibcfQY0jq5HkHnvcFORHK y9yqCWAo/KhWP6qQ1IZh1jeVRcjibf/kGj/rnSfJKg0LaTXk1KsRMRsZmVIqpIRjMCtBlAmM4 6c2Ew82fn3RWF3kf/9bJK8Z+GW0XLMT47WE7slFi7CfgAq6e/MdHucHgvDgs49H8PBOSQHTMQ nqdAkAE+ybqJGqGysLHKqH9uA9lJco6EITTHwf5lhWiMkypq6Kl0MSHN/l9nnmVA/VLTqfaAU bGVT0jYT7BHLYUkA2BHmBJnatbHRa2cBL4iuqrN/jWg3UATVldXoQKAsI0pNs+2bWEakq4909 bOskqzXg/j9laSopUPmr5b5cHeEjn0tX0rQTTY4ZQ+RrBSRApWiSAy+waU1V1hW9dgM+SBOmb n8ZUHnHK2LR+Mox2AJlJz65b26vceqywEed5RARoNAp+v24SicSuHDjft8dI/88uEmRCA4x1w 95WzSc9dDo9YVusAVJ30FqtiSZmm4+4uFxwAwPb5+7fpXIanmM82KU0MS9SZYjrIwbCEyHb8s squtK4F9rAb1W4qmrsihjAc3rbi7/a4XLZhOZJHRSP5ZvAxX2lgrnd31dcOYzxpR8xoUulXCy VNlf6lwKuRZTgF5xdsVdGv23sBjlDEuG9oz3HUDZ40BQOq3VmRf9zpKUfYsSNWDQ5NNDxwAWm uZDeOUjMbFWz9V4oRLc18SMGu4KYMKQ7/eFbHJlZ0XMgSH0fVWdG8y3vC1qQNxH/yDEdUCufC Guy3mjrRnFsRtHdHZ0gxJXWQ6xbS5Ex9R1uHSQAxaVNK2IdUYwuGGKbGI3qKl/EgnBsz0Pr23 sNF7yUCDDsQPTaab8+GwbrClE8xR7Cnkym1Dps4rVcz6cSlrOAsjYxVcaiHLr4mMrN5mEtdNX O2KHitsWOBbFZOC1yW40MIdnYBDYs6uv8EV8EpD3nWMF4Ly5JddaDqY2XHyyjJ5odmUnz5pz9 jdog+ZfXnXi0Myj3jm2ilxeeeqC87y6VDJ7tX4w45sLVEQ1Hk+/0XzlZR60rVPX0wCjJHeK2Q 1GGaTGaAtQ== Content-Transfer-Encoding: quoted-printable List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk Happened again even with max_wal_size=3D32GB on the 128GB WAL partition. A= t this point I'm quite sure what happens: + a checkpoint is going on for several minutes (because of I/O contention on the archival drive) + meanwhile data keeps coming in through several processes of pg_restore + data is coming in much faster because the WAL partition is high perf + The checkpoint makes it before full and finishes and frees up 32GB (max_wal_size) from the WAL drive. Lets say the WAL partition is now at 80/128GB full. + Immediately another checkpoint starts. This one has to checkpoint 80GB of data. + This doesn't make it in time and the WAL partition fills up. + Painful recovery follows, the database server won't even succeed in doing recovery when starting up... I'm surprised that there is no mechanism for the backends to block while the WAL is overflowing. Dimitris On Thu, 10 Apr 2025, Dimitrios Apostolou wrote: > Hello list, > > last night I got ENOSPACE on the WAL partition while running a huge > pg_restore on an empty and idle database. > The checkpoint that started 7 minutes earlier never finished: > > 04:31:09 LOG: checkpoint starting: wal > ... > 04:38:04 PANIC: could not write to file "pg_wal/xlogtemp.546204": No > space left on device > : CONTEXT: COPY table_partition_123, line 120872534 > > This is most likely because most of the data is written to a very slow > "archival" tablespace. No I/O errors on the system logs, I know things g= o > slow on that device (iSCSI drive over a shared spinning-disks pool and b= usy > network). > > This happened even though I'm keeping the WAL in its own partition (128G= B > partition size), with a lot of free space above max_wal_size (64GB). Som= ehow > it managed to grow above max_wal_size and fill 100% of the partition. > > I'm running latest PostgreSQL 17 and the settings have been temporarily > tweaked for fastest pg_restore: > > max_wal_size=3D64GB > max_replication_slots =3D 0 > max_logical_replication_workers =3D 0 > max_wal_senders =3D 0 > wal_level =3D minimal > autovacuum =3D off > > Several things seem to have gone wrong here. Questions: > > + The WAL partition is much faster than the archival tablespace. Am I in > constant danger of overruning max_wal_size? How to make 100% sure this > never happens again? > > + After recovery, with the database idling, I notice that WAL space usag= e > is constant at 64GB. Why doesn't it free up space down to min_wal_size > (1GB)? > > + I just created a 13GB zstd-compressed tarball of those 64GB WAL > files. This indicates that the files are compressible despite using > wal_compression=3Dzstd setting. Could it be that postgres ignores the = flag > and does not compress the WAL? How to check? > > + I'm using parallel pg_restore --data-only, can't avoid that for now. > Even though all the tables are empty (I truncated everything before > starting pg_restore), I can't find a way to avoid going through the WA= L. > Ideas? > > > Thanks in advance, > Dimitris > > >