From: Thomas Munro <[email protected]>
To: Alvaro Herrera <[email protected]>
Cc: Pecsök Ján <[email protected]>
Cc: [email protected] <[email protected]>
Cc: Andres Freund <[email protected]>
Subject: Re: Error:could not extend file " with FileFallocate(): No space left on device
Date: Thu, 12 Sep 2024 09:36:50 +1200
Message-ID: <CA+hUKGKM16mbwCvyRCJZHM-LdC+ufK0b09Wnmh9ivGQDi6bv6g@mail.gmail.com> (raw)
In-Reply-To: <[email protected]>
References: <AS1PR05MB9105F5A24271EEC357DA5D919F9B2@AS1PR05MB9105.eurprd05.prod.outlook.com>
	<[email protected]>

On Thu, Sep 12, 2024 at 12:39 AM Alvaro Herrera <[email protected]> wrote:
> On 2024-Sep-11, Pecsök Ján wrote:
> > In our case:
> > Kernel: Linux version 4.18.0-513.18.1.el8_9.ppc64le ([email protected]) (gcc version 8.5.0 20210514 (Red Hat 8.5.0-20) (GCC)) #1 SMP Thu Feb 1 02:52:53 EST 2024
> > File systém type:xfs
>
> Can you please share the output of xfs_info for the filesystem(s) used?
>
> Apparently, it's possible for allocation groups to be suboptimally laid
> out in a way that leads to ENOSPC with space still available.

Hmm, I have no clue about that, though I do remember reports of
spurious ENOSPC errors from xfs many years ago on some other database
I was around, maybe in the era of that kernel or a bit older.

Actually I was already wondering if we need to add a tunable to
control the heuristic that redirects to posix_fallocate():

https://www.postgresql.org/message-id/flat/CAMazQQfp%2B3f8tD_Q23rCR%3DO%2BJj4jouSRVigbD8OmrTOfHV%2B8...

There's no confirmation that writing zeros would be a useful
workaround here, though.  Two things changed in 16: the fallocate()
path was invented, but also we started extending by more than one
block at a time, which might take the pwritev() path or the
fallocate() path, for bulk insertion via COPY.  That btrfs user would
prefer pwritev() always IIRC, but if some version of xfs is allergic to
this pattern, I don't know if it's the size or the system call that's
triggering it...
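
To make the two extension paths concrete, here is a rough sketch in
Python rather than the actual C code.  The function names are
hypothetical, and the 8-block threshold is only an assumption about how
the heuristic behaves, not something confirmed in this thread.  A
tunable would amount to making that threshold adjustable.

```python
import os
import tempfile

BLCKSZ = 8192                   # PostgreSQL's default block size
FALLOCATE_THRESHOLD_BLOCKS = 8  # assumed cutoff, for illustration only


def extend_with_fallocate(fd, offset, nblocks):
    """Reserve space without writing data; raises OSError (e.g. ENOSPC)."""
    os.posix_fallocate(fd, offset, nblocks * BLCKSZ)


def extend_with_zeros(fd, offset, nblocks, max_batch_blocks=16):
    """Extend the file by explicitly writing zero bytes with pwritev()."""
    zeros = bytes(BLCKSZ * max_batch_blocks)
    end = offset + nblocks * BLCKSZ
    while offset < end:
        chunk = min(end - offset, len(zeros))
        # pwritev() may write less than requested, so advance by its result.
        offset += os.pwritev(fd, [memoryview(zeros)[:chunk]], offset)


def extend(fd, offset, nblocks, threshold=FALLOCATE_THRESHOLD_BLOCKS):
    """Pick a path by extension size; returns the name of the path taken."""
    if nblocks > threshold:
        extend_with_fallocate(fd, offset, nblocks)
        return "fallocate"
    extend_with_zeros(fd, offset, nblocks)
    return "pwritev"


if __name__ == "__main__":
    with tempfile.TemporaryFile() as f:
        print(extend(f.fileno(), 0, 4))             # small: zero-fill path
        print(extend(f.fileno(), 4 * BLCKSZ, 32))   # large: fallocate path
        print(os.fstat(f.fileno()).st_size // BLCKSZ, "blocks")
```

Passing a very large threshold forces the zero-writing path for every
extension, which is the kind of experiment that could tell the size
apart from the system call.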

Is COPY used here?

And just out of curiosity (I don't see any particular connection, or
what to do about it either way in the short term), are we talking about
really big tables with lots of 1GB segment files named N.1, N.2, N.3,
..., or millions of smaller tables?  I kinda wonder if xfs (and any
file system, really) would prefer us to use large files instead
(patches exist for this), and whether, when many-terabyte clusters
start working with huge numbers of segments, we reach fun new kinds of
internal resource exhaustion, or something like that...
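
For a sense of scale, a back-of-envelope sketch of how many segment
files one relation fork needs, assuming the default 8kB block size and
1GB segment size (both can be changed at build time).  The helper name
is made up for illustration.

```python
BLCKSZ = 8192          # default block size in bytes
RELSEG_SIZE = 131072   # default blocks per segment file (1GB)
SEGMENT_BYTES = BLCKSZ * RELSEG_SIZE


def segment_count(relation_bytes):
    """Number of segment files (N, N.1, N.2, ...) for one relation fork."""
    # Ceiling division; even an empty relation has its first file.
    return max(1, -(-relation_bytes // SEGMENT_BYTES))


if __name__ == "__main__":
    ten_tib = 10 * 1024**4
    print(segment_count(ten_tib))   # 10240 files for a single 10TiB table
```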

. o O { I particularly dislike our habit of synthesising fake ENOSPC
errors in a few code paths... grumble }





