Message-ID: <4B757D5D.3070506@enterprisedb.com>
Date: Fri, 12 Feb 2010 18:10:05 +0200
From: Heikki Linnakangas
Organization: EnterpriseDB
To: Fujii Masao
CC: Simon Riggs, Dimitri Fontaine, PostgreSQL-development
Subject: Re: Re: [COMMITTERS] pgsql: Make standby server continuously retry restoring the next WAL
In-Reply-To: <3f0b79eb1002120747q3203bed6ue1bd07558ec2e38b@mail.gmail.com>

Fujii Masao wrote:
> On Fri, Feb 12, 2010 at 10:10 PM, Heikki Linnakangas wrote:
>>> So I suggest that you have a new action that gets called after every
>>> checkpoint to clear down the archive. It will remove all files from the
>>> archive prior to %r. We can implement that as a sequence of unlink()s
>>> from within the server, or we can just call a script to do it. I prefer
>>> the latter approach. However we do it, we need something initiated by
>>> the server to maintain the archive and stop it from overflowing.
>> +1
>
> If we leave executing the remove_command to the bgwriter, the restartpoint
> might not happen unfortunately for a long time.

Are you thinking of a scenario where remove_command gets stuck, and
prevents bgwriter from performing restartpoints while it's stuck? You
have trouble if restore_command gets stuck like that as well, so I think
we can require that the remove_command returns in a reasonable period of
time, ie. in a few minutes.

> To prevent that situation, the
> archiver should execute the command, I think. Thought?

The archiver isn't running in standby, so that's not going to work. And
it's not connected to shared memory either, so it doesn't know what the
latest restartpoint is.

-- 
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com
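
For illustration, the "call a script to do it" option quoted above could look roughly like the sketch below. It is only a sketch under stated assumptions: the function name clean_archive is hypothetical, the script is assumed to receive the archive directory and the %r file name as arguments, and the choice to compare segments while ignoring the timeline prefix is an assumption rather than something settled in this thread.

```python
import os
import re

# WAL segment file names are 24 uppercase hex digits:
# timeline, log and segment number, 8 digits each.
WAL_SEGMENT_RE = re.compile(r"^[0-9A-F]{24}$")


def clean_archive(archive_dir, restart_file):
    """Remove WAL segments in archive_dir that precede restart_file (%r).

    Hypothetical helper: returns the names of the files it removed.
    """
    if not WAL_SEGMENT_RE.match(restart_file):
        raise ValueError("not a WAL segment name: %s" % restart_file)

    removed = []
    for name in sorted(os.listdir(archive_dir)):
        if not WAL_SEGMENT_RE.match(name):
            # Leave backup labels, timeline history files and anything
            # else that is not a plain WAL segment untouched.
            continue
        # Assumption: order by log/segment only, ignoring the timeline.
        if name[8:] < restart_file[8:]:
            os.unlink(os.path.join(archive_dir, name))
            removed.append(name)
    return removed
```

A script like this only does a directory scan and a handful of unlink() calls, so it should normally return quickly, which matters if it runs in the process that performs restartpoints.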