Message-ID: <4BAB3A51.5050707@enterprisedb.com>
Date: Thu, 25 Mar 2010 12:26:25 +0200
From: Heikki Linnakangas
Organization: EnterpriseDB
To: Simon Riggs
CC: Tom Lane, Fujii Masao, Aidan Van Dyk, PostgreSQL-development
Subject: Re: Re: [COMMITTERS] pgsql: Make standby server continuously retry restoring the next WAL
In-Reply-To: <1269505427.8481.8978.camel@ebony>

Simon Riggs wrote:
> On Thu, 2010-03-25 at 10:11 +0200, Heikki Linnakangas wrote:
>
>> PANIC seems like the appropriate solution for now.
>
> It definitely is not. Think some more.

Well, what happens in previous versions with pg_standby et al is that
the standby starts up. That doesn't seem appropriate either.

Hmm, it would be trivial to just stay in standby mode at a corrupt
file, continuously retrying to restore it and continue replay. If it's
genuinely corrupt, it will never succeed and the standby gets stuck at
that point. Maybe that's better; it's close to what Fujii suggested,
except that you don't need a new mode for it.

I'm worried that the administrator won't notice the error promptly,
because at a quick glance the server is up and running, while it's
actually stuck at the error and falling indefinitely behind the master.
Maybe if we make it a WARNING, that's enough to alleviate that. It's
true that if the standby is actively being used for read-only queries,
shutting it down just to get the administrator's attention isn't good
either.

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com
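
For illustration only, the "stay in standby mode, warn, and keep retrying" behaviour described in the message could be sketched roughly as below. This is not PostgreSQL code: it is a small Python model, and every name in it (`replay_with_retry`, `restore_segment`, `max_attempts`, `retry_interval`) is hypothetical. The bounded `max_attempts` exists only so the sketch can be exercised without hanging; the idea in the message corresponds to retrying forever.

```python
import logging
from typing import Callable, Optional

log = logging.getLogger("standby_sketch")


def replay_with_retry(
    restore_segment: Callable[[str], Optional[bytes]],
    segment_name: str,
    max_attempts: Optional[int] = None,
    sleep: Callable[[float], None] = lambda seconds: None,
    retry_interval: float = 5.0,
) -> Optional[bytes]:
    """Keep retrying a WAL segment restore instead of panicking or promoting.

    restore_segment (hypothetical callback) returns the segment contents, or
    None if the file is missing or fails validation. With max_attempts=None
    the loop never gives up, so a genuinely corrupt file leaves the standby
    stuck here, logging a warning on every failed attempt.
    The default sleep is a no-op; a real caller would pass time.sleep.
    """
    attempt = 0
    while max_attempts is None or attempt < max_attempts:
        attempt += 1
        data = restore_segment(segment_name)
        if data is not None:
            return data  # segment restored; replay can continue
        # WARNING rather than PANIC: the server stays up for read-only
        # queries, but the problem is visible in the log.
        log.warning(
            "could not restore %s (attempt %d); retrying",
            segment_name,
            attempt,
        )
        sleep(retry_interval)
    return None  # only reachable when max_attempts is bounded
```

The trade-off raised in the message is visible here: the loop keeps the server available, so noticing the problem depends entirely on someone reading the warnings.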