Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_CBC_SHA1:256) (Exim 4.89) (envelope-from ) id 1gty3x-0000AG-DQ for pgsql-hackers@arkaria.postgresql.org; Wed, 13 Feb 2019 17:11:09 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.89) (envelope-from ) id 1gty3v-0007aP-Uv for pgsql-hackers@arkaria.postgresql.org; Wed, 13 Feb 2019 17:11:07 +0000 Received: from magus.postgresql.org ([2a02:c0:301:0:ffff::29]) by malur.postgresql.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_CBC_SHA1:256) (Exim 4.89) (envelope-from ) id 1gty3v-0007aI-Nq for pgsql-hackers@lists.postgresql.org; Wed, 13 Feb 2019 17:11:07 +0000 Received: from new1-smtp.messagingengine.com ([66.111.4.221]) by magus.postgresql.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_CBC_SHA1:256) (Exim 4.89) (envelope-from ) id 1gty3t-00081i-Q5 for pgsql-hackers@lists.postgresql.org; Wed, 13 Feb 2019 17:11:07 +0000 Received: from compute5.internal (compute5.nyi.internal [10.202.2.45]) by mailnew.nyi.internal (Postfix) with ESMTP id 91E7316DE9; Wed, 13 Feb 2019 12:11:04 -0500 (EST) Received: from mailfrontend1 ([10.202.2.162]) by compute5.internal (MEProxy); Wed, 13 Feb 2019 12:11:04 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=anarazel.de; h= date:from:to:cc:subject:message-id:references:mime-version :content-type:in-reply-to; s=fm3; bh=f10J5p0PP1HElkcNw02I1ZNYLF2 kQgQh0SabH1Fgzkk=; b=Ac4cDdcgvNCh/joy9+Sln1rGkvWdiuWtf2mYs5MBW24 fXPXh4w5sSpsoSJYhtTNCT8i23S3eEMd4oocokRg1TbqlkQRehmU3YnahX9909kc tRvItjoYmQTlr4vq0GEXqOrT0545FoxG6yWensCWhiRS25g+bq1swEvTa2fYDIi4 4LhQc85XD78i6eqLa/pO6dBSHsGe6l2uxWv0OX05uKvedJNNPxIaPqslFvzeukRv dfOPER+qSsxTM6WgLDv9e60xyv7I908bSWwSFXkBqofcPaWvVtBiHCTbXI2Yx3AU eKEIEUA5jbXUEQhkS/+taRzkrul8DKvOUq413na7WoA== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:content-type:date:from:in-reply-to :message-id:mime-version:references:subject:to:x-me-proxy :x-me-proxy:x-me-sender:x-me-sender:x-sasl-enc; s=fm2; bh=f10J5p 0PP1HElkcNw02I1ZNYLF2kQgQh0SabH1Fgzkk=; b=tsmUa/NYkdWFuKzVj95rXp 3bi0POdxz45PzU5oYtdmb9bHh/spJ9PuHNNwFxxxWf8QDro44Y9K66yaEciWd6Jl eaPZwUIMU2K+RMYA/tBy0Brau0p3CnCV1GO9b5/pJQ0L7rOzqRdmnQSH0McoVSeY VSKSwjD554e/S84+LzDcWCmBQ9Tu7O0Dmm7AZvLD3yUET4al60UGytX7KevA9Cgd zxlMm6/b1/DPA3bK11TSYiw6me4qopAVhhFXg0aKQseUKCpVWYxaf1AIIkjIvKsw 58XBla1R9ROSkHNWAGRk47IJS+HZko1aB8qrQhA5vDGAVpkPgT2BYC2DJ8QIPb8g == X-ME-Sender: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgedtledruddtfedgleekucetufdoteggodetrfdotf fvucfrrhhofhhilhgvmecuhfgrshhtofgrihhlpdfquhhtnecuuegrihhlohhuthemucef tddtnecusecvtfgvtghiphhivghnthhsucdlqddutddtmdenucfjughrpeffhffvuffkfh ggtggujgesthdtredttddtvdenucfhrhhomheptehnughrvghsucfhrhgvuhhnugcuoegr nhgurhgvshesrghnrghrrgiivghlrdguvgeqnecukfhppeekkedruddvkedrkedurdduvd eknecurfgrrhgrmhepmhgrihhlfhhrohhmpegrnhgurhgvshesrghnrghrrgiivghlrdgu vgenucevlhhushhtvghrufhiiigvpedt X-ME-Proxy: Received: from intern.anarazel.de (unknown [88.128.81.128]) by mail.messagingengine.com (Postfix) with ESMTPA id DAA5DE46AD; Wed, 13 Feb 2019 12:11:03 -0500 (EST) Date: Wed, 13 Feb 2019 09:11:01 -0800 From: Andres Freund To: Tom Lane Cc: Thomas Munro , PostgreSQL Hackers Subject: Re: subscriptionCheck failures on nightjar Message-ID: <20190213171101.6wpz7tardp3t3uvk@alap3.anarazel.de> References: <17827.1549866683@sss.pgh.pa.us> <27965.1550077052@sss.pgh.pa.us> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <27965.1550077052@sss.pgh.pa.us> List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Precedence: bulk Hi, On 2019-02-13 11:57:32 -0500, Tom Lane wrote: > I've managed to reproduce this locally, and obtained this PANIC: Cool. How exactly? Nice catch. > Anyway, I think we might be able to fix this along the lines of > > CloseTransientFile(fd); > > + /* ensure snapshot file is down to stable storage */ > + fsync_fname(tmppath, false); > fsync_fname("pg_logical/snapshots", true); > > /* > * We may overwrite the work from some other backend, but that's ok, our > * snapshot is valid as well, we'll just have done some superfluous work. > */ > if (rename(tmppath, path) != 0) > { > ereport(ERROR, > (errcode_for_file_access(), > errmsg("could not rename file \"%s\" to \"%s\": %m", > tmppath, path))); > } > > - /* make sure we persist */ > - fsync_fname(path, false); > + /* ensure we persist the file rename */ > fsync_fname("pg_logical/snapshots", true); Hm, but that's not the same? On some filesystems one needs the directory fsync, on some the file fsync, and I think both in some cases. ISTM we should just open the file before the rename, and then use fsync() on the filehandle rather than the filename. Then a concurrent rename ought not to hurt us? > The existing code here seems simply wacky/unsafe to me regardless > of this race condition: couldn't it potentially result in a corrupt > snapshot file appearing with a valid name, if the system crashes > after persisting the rename but before it's pushed the data out? What do you mean precisely with "before it's pushed the data out"? > I also wonder why bother with the directory sync just before the > rename. Because on some FS/OS combinations the size of the renamed-into-place file isn't guaranteed to be durable unless the directory was fsynced. Otherwise there's the possibility to have incomplete data if we crash after renaming, but before fsyncing. Greetings, Andres Freund