From: Tom Lane
To: Andres Freund
cc: Thomas Munro, PostgreSQL Hackers
Subject: Re: subscriptionCheck failures on nightjar
In-reply-to: <20190213174151.mfylkessxmapt4io@alap3.anarazel.de>
References: <17827.1549866683@sss.pgh.pa.us>
	<27965.1550077052@sss.pgh.pa.us>
	<20190213171101.6wpz7tardp3t3uvk@alap3.anarazel.de>
	<29708.1550079455@sss.pgh.pa.us>
	<20190213174151.mfylkessxmapt4io@alap3.anarazel.de>
Date: Wed, 13 Feb 2019 12:59:19 -0500
Message-ID: <30608.1550080759@sss.pgh.pa.us>

Andres Freund writes:
> On 2019-02-13 12:37:35 -0500, Tom Lane wrote:
>> Bleah.  But in any case, the rename should not create a situation
>> in which we need to fsync the file data again.

> Well, it's not super well defined which of either you need to make the
> rename durable, and it appears to differ between OSs.  Any argument
> against fixing it up like I suggested, by using an fd from before the
> rename?

I'm unimpressed.  You're speculating about the filesystem having random
deviations from POSIX behavior, and using that weak argument to justify a
totally untested technique having its own obvious portability hazards.
Who's to say that an fsync on a file opened before a rename is going to do
anything good after the rename?  (On, eg, NFS there are obvious reasons
why it might not.)

Also, I wondered why this is coming out as a PANIC.  I thought originally
that somebody must be causing this code to run in a critical section,
but it looks like the real issue is just that fsync_fname() uses
data_sync_elevel, which is

int
data_sync_elevel(int elevel)
{
	return data_sync_retry ? elevel : PANIC;
}

I really really don't want us doing questionably-necessary fsyncs with a
PANIC as the result.  Perhaps more to the point, the way this was coded,
the PANIC applies to open() failures in fsync_fname_ext() not just fsync()
failures; that's painting with too broad a brush isn't it?

			regards, tom lane