From: Andres Freund <[email protected]>
To: Thomas Munro <[email protected]>
Cc: Tom Lane <[email protected]>
Cc: PostgreSQL Hackers <[email protected]>
Subject: Re: subscriptionCheck failures on nightjar
Date: Wed, 13 Feb 2019 13:51:47 -0800
Message-ID: <[email protected]>
In-Reply-To: <CAEepm=0wB7vgztC5sg2nmJ-H3bnrBT5GQfhUzP+Ffq-WT3g8VA@mail.gmail.com>
References: <[email protected]>
<[email protected]>
<[email protected]>
<[email protected]>
<[email protected]>
<[email protected]>
<[email protected]>
<[email protected]>
<[email protected]>
<CAEepm=0wB7vgztC5sg2nmJ-H3bnrBT5GQfhUzP+Ffq-WT3g8VA@mail.gmail.com>
Hi,
On 2019-02-14 09:52:33 +1300, Thomas Munro wrote:
> On Thu, Feb 14, 2019 at 8:11 AM Tom Lane <[email protected]> wrote:
> > Andres Freund <[email protected]> writes:
> > > I was kinda pondering just open coding it. I am not yet convinced that
> > > my idea of just using an open FD isn't the least bad approach for the
> > > issue at hand. What precisely is the NFS issue you're concerned about?
> >
> > I'm not sure that fsync-on-FD after the rename will work, considering that
> > the issue here is that somebody might've unlinked the file altogether
> > before we get to doing the fsync. I don't have a hard time believing that
> > that might result in a failure report on NFS or similar. Yeah, it's
> > hypothetical, but the argument that we need a repeat fsync at all seems
> > equally hypothetical.
> >
> > > Right now fsync_fname_ext isn't exposed outside fd.c...
> >
> > Mmm. That makes it easier to consider changing its API.
>
> Just to make sure I understand: it's OK for the file not to be there
> when we try to fsync it by name, because a concurrent checkpoint can
> remove it, having determined that we don't need it anymore? In other
> words, we really needed either missing_ok=true semantics, or to use
> the fd we already had instead of the name?
I'm not yet sure that that's actually something that's supposed to
happen; I've got to spend some time analysing how this actually
happens. Normally the contents of the slot should actually prevent it
from being removed (as they're newer than
ReplicationSlotsComputeLogicalRestartLSN()). I kind of wonder if that's
a bug in the drop logic in newer releases.
Greetings,
Andres Freund
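
For illustration, the two options raised in the quoted question (fsync by name with missing_ok=true semantics, versus fsync on the file descriptor already held) can be sketched in C roughly as below. This is a minimal sketch under assumptions: fsync_by_name() and fsync_by_fd() are hypothetical helpers, not the fsync_fname_ext() in fd.c, whose actual signature is not shown in this thread. Behaviour on NFS, which is the concern raised above, is not covered by this sketch.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdbool.h>
#include <unistd.h>

/*
 * Hypothetical helper: fsync a regular file by name.
 *
 * With missing_ok = true, a file that has already been unlinked (ENOENT)
 * is treated as success, on the theory that whoever removed it decided it
 * was no longer needed. Returns 0 on success, -1 on failure (errno set).
 */
static int
fsync_by_name(const char *path, bool missing_ok)
{
    int fd;
    int rc;
    int saved_errno;

    assert(path != NULL);

    /* O_RDWR because some platforms refuse fsync() on read-only fds. */
    fd = open(path, O_RDWR);
    if (fd < 0)
    {
        if (missing_ok && errno == ENOENT)
            return 0;           /* file vanished; nothing left to flush */
        return -1;
    }

    rc = fsync(fd);
    saved_errno = errno;        /* close() may overwrite errno */
    close(fd);
    errno = saved_errno;
    return rc;
}

/*
 * Hypothetical alternative: fsync the descriptor the caller already holds.
 * On a local POSIX filesystem this keeps working after the name has been
 * unlinked, because the open fd keeps the inode alive.
 */
static int
fsync_by_fd(int fd)
{
    return fsync(fd);
}
```

With the first helper, a caller that loses the race against a concurrent unlink sees success only if it passed missing_ok = true; with the second, the race does not surface at the fsync call at all, at least on a local filesystem.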