From: Tomas Vondra <[email protected]>
To: Kuntal Ghosh <[email protected]>
Cc: Michael Paquier <[email protected]>
Cc: Tom Lane <[email protected]>
Cc: Robert Haas <[email protected]>
Cc: Andres Freund <[email protected]>
Cc: Thomas Munro <[email protected]>
Cc: PostgreSQL Hackers <[email protected]>
Subject: Re: subscriptionCheck failures on nightjar
Date: Wed, 18 Sep 2019 23:58:08 +0200
Message-ID: <20190918215808.yonxqgycme6pbctp@development>
In-Reply-To: <CAGz5QCJv5JbRDsATDTkJqq7h9F7u0QLnNnLHfxR1nEOa4DnkJQ@mail.gmail.com>
References: <[email protected]>
	<CAEepm=0wB7vgztC5sg2nmJ-H3bnrBT5GQfhUzP+Ffq-WT3g8VA@mail.gmail.com>
	<[email protected]>
	<[email protected]>
	<20190826132904.3ayuw36qzl2c4ktr@development>
	<CA+TgmoaNOMG9+Ho9d3CX+-10O7+nqqvmSpXb1m0F3dqWB4C-8g@mail.gmail.com>
	<[email protected]>
	<20190917194510.iqwyl3be62pz7l27@development>
	<[email protected]>
	<CAGz5QCJv5JbRDsATDTkJqq7h9F7u0QLnNnLHfxR1nEOa4DnkJQ@mail.gmail.com>

On Wed, Sep 18, 2019 at 04:25:14PM +0530, Kuntal Ghosh wrote:
>Hello Michael,
>
>On Wed, Sep 18, 2019 at 6:28 AM Michael Paquier <[email protected]> wrote:
>>
>> On my side, I have let this thing run for a couple of hours with a
>> patched version to include a sleep between the rename and the sync but
>> I could not reproduce it either:
>> #!/bin/bash
>> attempt=0
>> while true; do
>>         attempt=$((attempt+1))
>>         echo "Attempt $attempt"
>>         cd $HOME/postgres/src/test/recovery/
>>         PROVE_TESTS=t/006_logical_decoding.pl make check > /dev/null 2>&1
>>         ERRNUM=$?
>>         if [ $ERRNUM != 0 ]; then
>>                 echo "Failed at attempt $attempt"
>>                 exit $ERRNUM
>>         fi
>> done
>I think the failing test is src/test/subscription/t/010_truncate.pl.
>I've tried to reproduce the same failure using your script in OS X
>10.14 and Ubuntu 18.04.2 (Linux version 5.0.0-23-generic), but
>couldn't reproduce the same.
>

I kinda suspect it might be just a coincidence that it fails during that
particular test. What likely plays a role here is checkpoint timing
(AFAICS the checkpoint is what removes the file).  On most systems the
tests complete before any checkpoint is triggered, hence no issue.

Maybe aggressively triggering checkpoints on the running cluster from
another session would do the trick ...
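
Something along these lines might work as a rough sketch (untested
against the actual failure). The checkpoint_loop function name, the PSQL
variable and the interval are just placeholders I made up, and it
assumes PGHOST/PGPORT are exported so that psql can reach the temporary
cluster started by the TAP test:

```shell
#!/bin/bash
# Hypothetical helper: repeatedly issue CHECKPOINT against a running cluster.
# Assumes PGHOST/PGPORT point at the test cluster and that the connecting
# role is allowed to run CHECKPOINT (a superuser).
PSQL=${PSQL:-psql}

checkpoint_loop() {
        local count=$1
        local interval=${2:-1}
        local i=0
        while [ "$i" -lt "$count" ]; do
                # Ignore failures: the cluster may be starting up or
                # shutting down between individual tests.
                $PSQL -X -q -d postgres -c 'CHECKPOINT' > /dev/null 2>&1 || true
                i=$((i+1))
                sleep "$interval"
        done
        echo "issued $i checkpoint attempts"
}
```

Running e.g. "checkpoint_loop 1000 1" in a second terminal while the
reproducer loop runs in the first should make it much more likely that a
checkpoint falls into the window between the rename and the sync.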

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services 




