Subject: Re: Re: [COMMITTERS] pgsql: Make standby server continuously retry restoring the next WAL
From: Simon Riggs
To: Heikki Linnakangas
Cc: Fujii Masao, Aidan Van Dyk, PostgreSQL-development
In-Reply-To: <4BAA060A.2020000@enterprisedb.com>
Date: Wed, 24 Mar 2010 23:23:01 +0000
Message-Id: <1269472981.8481.8946.camel@ebony>

On Wed, 2010-03-24 at 14:31 +0200, Heikki Linnakangas wrote:
> Fujii Masao wrote:
> > But in the current (v8.4 or before) behavior, recovery ends normally
> > when an invalid record is found in an archived WAL file. Otherwise,
> > the server would never be able to start normal processing when there
> > is a corrupted archived file for some reasons. So, that invalid record
> > should not be treated as a PANIC if the server is not in standby mode
> > or the trigger file has been created. Thought?
>
> Hmm, true, this changes behavior over previous releases. I tend to think
> that it's always an error if there's a corrupt file in the archive,
> though, and PANIC is appropriate. If the administrator wants to start up
> the database anyway, he can remove the corrupt file from the archive and
> place it directly in pg_xlog instead.

I don't agree with changing the behaviour from previous releases.
PANICing won't change the situation, so it just destroys server
availability. If we had 1 master and 42 slaves then this behaviour would
take down almost the whole server farm at once. Very uncool.

You might have reason to prevent the server starting up at that point,
when in standby mode, but that is not a reason to PANIC. We don't really
want all of the standbys thinking they can be the master all at once
either. Better to throw a serious ERROR and have the server still up and
available for reads.

-- 
Simon Riggs www.2ndQuadrant.com