Date: Wed, 13 Feb 2019 10:12:25 -0800
From: Andres Freund
To: Tom Lane
Cc: Thomas Munro, PostgreSQL Hackers
Subject: Re: subscriptionCheck failures on nightjar
Message-ID: <20190213181225.fathyapig4sm4exa@alap3.anarazel.de>
In-Reply-To: <30608.1550080759@sss.pgh.pa.us>

Hi,

On 2019-02-13 12:59:19 -0500, Tom Lane wrote:
> Andres Freund writes:
> > On 2019-02-13 12:37:35 -0500, Tom Lane wrote:
> >> Bleah.  But in any case, the rename should not create a situation
> >> in which we need to fsync the file data again.
>
> > Well, it's not super well defined which of either you need to make the
> > rename durable, and it appears to differ between OSs. Any argument
> > against fixing it up like I suggested, by using an fd from before the
> > rename?
>
> I'm unimpressed.  You're speculating about the filesystem having random
> deviations from POSIX behavior, and using that weak argument to justify a
> totally untested technique having its own obvious portability hazards.

Uhm, we've reproduced failures due to the lack of such fsyncs at some
point. And not some fringe OS, but ext4 (albeit with data=writeback).

I don't think POSIX has yet figured out what they actually think is
required:
http://austingroupbugs.net/view.php?id=672

I guess we could just ignore ENOENT in this case, that ought to be just
as safe as using the old fd.

> Also, I wondered why this is coming out as a PANIC.  I thought originally
> that somebody must be causing this code to run in a critical section,
> but it looks like the real issue is just that fsync_fname() uses
> data_sync_elevel, which is
>
> int
> data_sync_elevel(int elevel)
> {
>     return data_sync_retry ? elevel : PANIC;
> }
>
> I really really don't want us doing questionably-necessary fsyncs with a
> PANIC as the result.

Well, given the 'failed fsync throws dirty data away' issue, I don't
quite see what we can do otherwise. But:

> Perhaps more to the point, the way this was coded, the PANIC applies
> to open() failures in fsync_fname_ext() not just fsync() failures;
> that's painting with too broad a brush isn't it?

That indeed seems wrong. Thomas? I'm not quite sure how to best fix
this though - I guess we could rename fsync_fname_ext's elevel parameter
to fsync_failure_elevel? It's not visible outside fd.c, so that'd not be
too bad?

Greetings,

Andres Freund
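
To make the two suggestions above concrete (tolerating ENOENT on open(), and
treating only a failed fsync() as the severe case), here is a minimal
standalone sketch in plain POSIX C. It is not the fd.c code and has not been
run against PostgreSQL: fsync_path_sketch, its missing_ok parameter, and the
SyncResult codes are all hypothetical names invented for illustration. In
the real code the two failure cases would presumably be reported at
different elevels rather than returned.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdbool.h>
#include <unistd.h>

/* Hypothetical result codes; stand-ins for "which elevel would be used". */
typedef enum
{
    SYNC_OK = 0,
    SYNC_OPEN_FAILED,   /* ordinary error: no dirty data has been lost */
    SYNC_FSYNC_FAILED   /* the only case that would warrant data_sync_elevel() */
} SyncResult;

/*
 * Open a regular file and fsync it, keeping open() failures and fsync()
 * failures apart.  With missing_ok set, a file that no longer exists
 * (ENOENT) is treated as success, since there is nothing left to flush.
 */
static SyncResult
fsync_path_sketch(const char *path, bool missing_ok)
{
    int fd = open(path, O_RDWR);

    if (fd < 0)
    {
        if (errno == ENOENT && missing_ok)
            return SYNC_OK;
        return SYNC_OPEN_FAILED;
    }

    if (fsync(fd) != 0)
    {
        close(fd);
        return SYNC_FSYNC_FAILED;
    }

    close(fd);
    return SYNC_OK;
}
```

A caller that has just renamed or removed the file would pass missing_ok =
true; everything else keeps the strict behaviour, and only
SYNC_FSYNC_FAILED would need to escalate.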