Date: Wed, 18 Sep 2019 23:58:08 +0200
From: Tomas Vondra
To: Kuntal Ghosh
Cc: Michael Paquier, Tom Lane, Robert Haas, Andres Freund, Thomas Munro, PostgreSQL Hackers
Subject: Re: subscriptionCheck failures on nightjar
Message-ID: <20190918215808.yonxqgycme6pbctp@development>

On Wed, Sep 18, 2019 at 04:25:14PM +0530, Kuntal Ghosh wrote:
>Hello Michael,
>
>On Wed, Sep 18, 2019 at 6:28 AM Michael Paquier wrote:
>>
>> On my side, I have let this thing run for a couple of hours with a
>> patched version to include a sleep between the rename and the sync but
>> I could not reproduce it either:
>> #!/bin/bash
>> attempt=0
>> while true; do
>>   attempt=$((attempt+1))
>>   echo "Attempt $attempt"
>>   cd $HOME/postgres/src/test/recovery/
>>   PROVE_TESTS=t/006_logical_decoding.pl make check > /dev/null 2>&1
>>   ERRNUM=$?
>>   if [ $ERRNUM != 0 ]; then
>>     echo "Failed at attempt $attempt"
>>     exit $ERRNUM
>>   fi
>> done
>
>I think the failing test is src/test/subscription/t/010_truncate.pl.
>I've tried to reproduce the same failure using your script in OS X
>10.14 and Ubuntu 18.04.2 (Linux version 5.0.0-23-generic), but
>couldn't reproduce the same.
>

I kinda suspect it might be just a coincidence that it fails during that
particular test. What likely plays a role here is checkpoint timing
(AFAICS that's the thing removing the file). On most systems the tests
complete before any checkpoint is triggered, hence no issue.

Maybe aggressively triggering checkpoints on the running cluster from
another session would do the trick ...

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
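
A rough sketch of the "trigger checkpoints from another session" idea, in the same style as the quoted script. The function name checkpoint_loop, the PSQL and INTERVAL variables, and the "postgres" database name are assumptions made for illustration; none of them come from the thread. The connection settings (PGHOST, PGPORT) would have to point at the cluster started by the test, and the connecting role needs enough privileges to run CHECKPOINT.

```shell
#!/bin/bash
# Hypothetical helper: repeatedly issue CHECKPOINT against a running cluster.
# PSQL and INTERVAL are illustrative knobs, not anything defined by PostgreSQL.
PSQL=${PSQL:-psql}
INTERVAL=${INTERVAL:-0.1}

# checkpoint_loop COUNT
#   COUNT > 0: issue that many checkpoints, then stop.
#   COUNT <= 0 (or omitted): loop until interrupted or psql fails.
checkpoint_loop() {
    local count=${1:-0}
    local i=0
    while [ "$count" -le 0 ] || [ "$i" -lt "$count" ]; do
        # -X skips ~/.psqlrc, -q keeps the output quiet.
        $PSQL -X -q -d postgres -c 'CHECKPOINT' || return 1
        i=$((i+1))
        sleep "$INTERVAL"
    done
    echo "issued $i checkpoints"
}
```

Running checkpoint_loop 0 in a second terminal while the test loop is going would keep requesting checkpoints until the cluster shuts down (at which point psql fails and the loop exits). Whether this actually makes the failure reproducible is untested here.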