From: Andrew Dunstan
Subject: Re: subscriptionCheck failures on nightjar
To: Andres Freund, Tom Lane
Cc: Thomas Munro, PostgreSQL Hackers
Date: Sun, 10 Mar 2019 14:40:15 -0400
In-Reply-To: <20190213181225.fathyapig4sm4exa@alap3.anarazel.de>
References: <17827.1549866683@sss.pgh.pa.us> <27965.1550077052@sss.pgh.pa.us>
 <20190213171101.6wpz7tardp3t3uvk@alap3.anarazel.de> <29708.1550079455@sss.pgh.pa.us>
 <20190213174151.mfylkessxmapt4io@alap3.anarazel.de> <30608.1550080759@sss.pgh.pa.us>
 <20190213181225.fathyapig4sm4exa@alap3.anarazel.de>

On 2/13/19 1:12 PM, Andres Freund wrote:
> Hi,
>
> On 2019-02-13 12:59:19 -0500, Tom Lane wrote:
>> Andres Freund writes:
>>> On 2019-02-13 12:37:35 -0500, Tom Lane wrote:
>>>> Bleah. But in any case, the rename should not create a situation
>>>> in which we need to fsync the file data again.
>>> Well, it's not super well defined which of either you need to make the
>>> rename durable, and it appears to differ between OSs. Any argument
>>> against fixing it up like I suggested, by using an fd from before the
>>> rename?
>> I'm unimpressed. You're speculating about the filesystem having random
>> deviations from POSIX behavior, and using that weak argument to justify a
>> totally untested technique having its own obvious portability hazards.
> Uhm, we've reproduced failures due to the lack of such fsyncs at some
> point. And not some fringe OS, but ext4 (albeit with data=writeback).
>
> I don't think POSIX has yet figured out what they actually think is
> required:
> http://austingroupbugs.net/view.php?id=672
>
> I guess we could just ignore ENOENT in this case, that ought to be just
> as safe as using the old fd.
>
>
>> Also, I wondered why this is coming out as a PANIC. I thought originally
>> that somebody must be causing this code to run in a critical section,
>> but it looks like the real issue is just that fsync_fname() uses
>> data_sync_elevel, which is
>>
>>     int
>>     data_sync_elevel(int elevel)
>>     {
>>         return data_sync_retry ? elevel : PANIC;
>>     }
>>
>> I really really don't want us doing questionably-necessary fsyncs with a
>> PANIC as the result.
> Well, given the 'failed fsync throws dirty data away' issue, I don't
> quite see what we can do otherwise. But:
>
>
>> Perhaps more to the point, the way this was coded, the PANIC applies
>> to open() failures in fsync_fname_ext() not just fsync() failures;
>> that's painting with too broad a brush isn't it?
> That indeed seems wrong. Thomas? I'm not quite sure how to best fix
> this though - I guess we could rename fsync_fname_ext's elevel parameter
> to fsync_failure_elevel? It's not visible outside fd.c, so that'd not be
> too bad?
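
For reference, the two ideas in the quoted text (tolerating ENOENT, and not
applying the fsync-failure severity to open() failures) could be sketched
roughly as below. This is only an illustration under stated assumptions, not
the code in fd.c: `sketch_fsync_path`, `sketch_level`, and the `missing_ok`
flag are hypothetical names, and the levels merely stand in for elevel values.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical severity levels standing in for elevel values. */
typedef enum
{
    SKETCH_OK = 0,
    SKETCH_WARNING = 1,
    SKETCH_PANIC = 2
} sketch_level;

/*
 * Hypothetical helper: fsync the file at `path`, reporting open() failures
 * and fsync() failures at separately chosen levels.  If `missing_ok` is
 * nonzero, a file that has vanished (ENOENT) is not treated as an error.
 * Returns the level that was triggered, or SKETCH_OK on success.
 */
sketch_level
sketch_fsync_path(const char *path,
                  sketch_level open_failure_level,
                  sketch_level fsync_failure_level,
                  int missing_ok)
{
    int fd;

    assert(path != NULL);

    fd = open(path, O_RDWR);
    if (fd < 0)
    {
        /* Nothing to sync if the file is already gone and that is allowed. */
        if (errno == ENOENT && missing_ok)
            return SKETCH_OK;
        fprintf(stderr, "could not open \"%s\": %s\n", path, strerror(errno));
        return open_failure_level;
    }

    if (fsync(fd) != 0)
    {
        int save_errno = errno;

        close(fd);
        fprintf(stderr, "could not fsync \"%s\": %s\n",
                path, strerror(save_errno));
        /* Only a real fsync() failure gets the stricter level. */
        return fsync_failure_level;
    }

    if (close(fd) != 0)
    {
        fprintf(stderr, "could not close \"%s\": %s\n", path, strerror(errno));
        return open_failure_level;
    }

    return SKETCH_OK;
}
```

With this shape a caller could pass a mild level for open() problems and
reserve the PANIC-like level for fsync() itself, which is the separation the
quoted text asks about.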

Thread seems to have gone quiet ...

cheers

andrew

-- 
Andrew Dunstan                https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services