From: Fujii Masao
To: Heikki Linnakangas
Cc: PostgreSQL-development
Date: Wed, 10 Feb 2010 14:05:55 +0900
Subject: Re: [COMMITTERS] pgsql: Make standby server continuously retry restoring the next WAL
Message-ID: <3f0b79eb1002092105r21e009d3v468496058ba04392@mail.gmail.com>
In-Reply-To: <20100127152751.3B2047541B9@cvs.postgresql.org>
References: <20100127152751.3B2047541B9@cvs.postgresql.org>

On Thu, Jan 28, 2010 at 12:27 AM, Heikki Linnakangas wrote:
> Log Message:
> -----------
> Make standby server continuously retry restoring the next WAL segment with
> restore_command, if the connection to the primary server is lost. This
> ensures that the standby can recover automatically, if the connection is
> lost for a long time and standby falls behind so much that the required
> WAL segments have been archived and deleted in the master.
>
> This also makes standby_mode useful without streaming replication; the
> server will keep retrying restore_command every few seconds until the
> trigger file is found. That's the same basic functionality pg_standby
> offers, but without the bells and whistles.

http://archives.postgresql.org/pgsql-hackers/2010-01/msg01520.php
http://archives.postgresql.org/pgsql-hackers/2010-01/msg02589.php

As I pointed out previously, the standby might restore a partially-filled
WAL file that is still being archived by the primary, and this causes a
FATAL error. This happened on my box while I was testing streaming
replication (SR):

sby [20088] FATAL: archive file "000000010000000000000087" has wrong size: 14139392 instead of 16777216
sby [20076] LOG: startup process (PID 20088) exited with exit code 1
sby [20076] LOG: terminating any other active server processes
act [18164] LOG: received immediate shutdown request

If the startup process is in standby mode, I think that it should retry
starting replication instead of emitting an error when it finds a
partially-filled file in the archive. Then, if replication has been
terminated in the meantime, it only has to restore the archived file again.

Thoughts?

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
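
To make the proposal concrete, here is a minimal sketch of the decision being suggested: treat a wrong-sized archive file as "not ready yet" when in standby mode, and only fail hard otherwise. This is an illustration, not PostgreSQL's actual code; the names restore_segment, make_file and WAL_SEGMENT_SIZE are hypothetical, and the 16777216-byte size is simply the value from the log above.

```python
import os
import shutil
import tempfile

# Assumption: fixed segment size, taken from the FATAL message in the log.
WAL_SEGMENT_SIZE = 16 * 1024 * 1024  # 16777216 bytes


def restore_segment(archive_path, dest_path, standby_mode):
    """Hypothetical helper: copy a WAL segment out of the archive.

    Returns "restored" on success, "retry" when the caller should try
    again later (standby mode only), or "missing" when the file is absent
    outside standby mode. A wrong-sized file outside standby mode raises,
    mirroring the FATAL error.
    """
    if not os.path.exists(archive_path):
        return "retry" if standby_mode else "missing"

    size = os.path.getsize(archive_path)
    if size != WAL_SEGMENT_SIZE:
        if standby_mode:
            # The primary may still be writing this file into the archive:
            # do not fail, let the caller retry replication or the restore.
            return "retry"
        raise RuntimeError(
            f"archive file has wrong size: {size} instead of {WAL_SEGMENT_SIZE}"
        )

    shutil.copyfile(archive_path, dest_path)
    return "restored"


def make_file(path, size):
    """Create (or overwrite) a file of exactly `size` bytes, for the demo."""
    with open(path, "wb") as f:
        f.truncate(size)


if __name__ == "__main__":
    workdir = tempfile.mkdtemp()
    src = os.path.join(workdir, "000000010000000000000087")
    dst = os.path.join(workdir, "restored_segment")

    make_file(src, 14139392)                 # partially archived file
    print(restore_segment(src, dst, True))   # standby mode: retry later

    make_file(src, WAL_SEGMENT_SIZE)         # archiving has finished
    print(restore_segment(src, dst, True))   # now the segment is restored
```

The point of the sketch is only the branch on standby_mode: the same size check that is fatal during ordinary archive recovery becomes a reason to wait and retry on a standby.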