From: Thomas Munro
Date: Thu, 14 Feb 2019 00:55:59 +1300
Subject: Re: subscriptionCheck failures on nightjar
To: Tom Lane
Cc: PostgreSQL Hackers
In-Reply-To: <17827.1549866683@sss.pgh.pa.us>

On Mon, Feb 11, 2019 at 7:31 PM Tom Lane wrote:
> 2019-02-10 23:55:58.798 EST [40728] sub1 PANIC: could not open file "pg_logical/snapshots/0-160B578.snap": No such file or directory

They get atomically renamed into place, which seems kosher even if
snapshots for the same LSN are created concurrently by different
backends (and tracing syscalls confirms that that does occasionally
happen). It's hard to believe that nightjar's rename() ceased to be
atomic a couple of months ago. It looks like the only way for files
to get unlinked after that is by CheckPointSnapBuild() deciding they
are too old.

Hmm. Could this be relevant, and cause a well timed checkpoint to
unlink files too soon?

2019-02-12 21:52:58.304 EST [22922] WARNING: out of logical replication worker slots

-- 
Thomas Munro
http://www.enterprisedb.com
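
For illustration, the write-then-rename pattern referred to above can be
sketched in C roughly as follows. This is a minimal sketch and not
PostgreSQL's actual snapshot-writing code: the function name
write_file_atomically(), the ".<pid>.tmp" suffix, and the fixed-size path
buffer are assumptions made for the example. The point it shows is that each
writer fills a private temporary file and only then rename()s it over the
final name, so a reader that opens the final name sees either no file or a
complete one, even if two writers race to produce the same file.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical helper (not PostgreSQL code): write len bytes to a
 * process-private temporary file, fsync it, then rename() it to final_path.
 * Returns 0 on success, -1 on failure.
 */
int
write_file_atomically(const char *final_path, const void *data, size_t len)
{
    char        tmp_path[1024];
    const char *p = data;
    int         fd;
    int         n;

    assert(final_path != NULL && data != NULL);

    /* Include the PID so that concurrent writers never share a temp file. */
    n = snprintf(tmp_path, sizeof(tmp_path), "%s.%ld.tmp",
                 final_path, (long) getpid());
    if (n < 0 || (size_t) n >= sizeof(tmp_path))
        return -1;

    fd = open(tmp_path, O_WRONLY | O_CREAT | O_EXCL, 0600);
    if (fd < 0)
        return -1;

    /* Loop because write() may write fewer bytes than requested. */
    while (len > 0)
    {
        ssize_t     written = write(fd, p, len);

        if (written <= 0)
        {
            close(fd);
            unlink(tmp_path);
            return -1;
        }
        p += written;
        len -= (size_t) written;
    }

    /* Flush the contents before the file becomes visible under its name. */
    if (fsync(fd) != 0)
    {
        close(fd);
        unlink(tmp_path);
        return -1;
    }
    if (close(fd) != 0)
    {
        unlink(tmp_path);
        return -1;
    }

    /* rename() replaces any existing final_path in a single step. */
    if (rename(tmp_path, final_path) != 0)
    {
        unlink(tmp_path);
        return -1;
    }
    return 0;
}
```

With this shape, two processes calling write_file_atomically() for the same
final_path each work on their own temporary file, and whichever rename()
happens last wins. Nothing in the pattern itself removes the final file,
which is why a later unlink (for example by a cleanup pass) is the more
likely source of a "No such file or directory" error. The sketch does not
retry on EINTR and does not fsync the containing directory.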