To: pgsql-committers@postgresql.org
Subject: pgsql: Make standby server continuously retry restoring the next WAL
Date: Wed, 27 Jan 2010 15:27:51 +0000 (UTC)
From: heikki@postgresql.org (Heikki Linnakangas)
Message-Id: <20100127152751.3B2047541B9@cvs.postgresql.org>

Log Message:
-----------
Make standby server continuously retry restoring the next WAL segment with restore_command, if the connection to the primary server is lost. This ensures that the standby can recover automatically, if the connection is lost for a long time and standby falls behind so much that the required WAL segments have been archived and deleted in the master.

This also makes standby_mode useful without streaming replication; the server will keep retrying restore_command every few seconds until the trigger file is found. That's the same basic functionality pg_standby offers, but without the bells and whistles.

To implement that, refactor the ReadRecord/FetchRecord functions.
The FetchRecord() function introduced in the original streaming replication patch is removed, and all the retry logic is now in a new function called XLogReadPage(). XLogReadPage() is now responsible for executing restore_command, launching walreceiver, and waiting for new WAL to arrive from primary, as required.

This also changes the life cycle of walreceiver. When launched, it now only tries to connect to the master once, and exits if the connection fails, or is lost during streaming for any reason. The startup process detects the death, and re-launches walreceiver if necessary.

Modified Files:
--------------
pgsql/src/backend/access/transam:
    xlog.c (r1.361 -> r1.362)
    (http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/src/backend/access/transam/xlog.c?r1=1.361&r2=1.362)
pgsql/src/backend/postmaster:
    postmaster.c (r1.601 -> r1.602)
    (http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/src/backend/postmaster/postmaster.c?r1=1.601&r2=1.602)
pgsql/src/backend/replication:
    walreceiver.c (r1.1 -> r1.2)
    (http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/src/backend/replication/walreceiver.c?r1=1.1&r2=1.2)
    walreceiverfuncs.c (r1.2 -> r1.3)
    (http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/src/backend/replication/walreceiverfuncs.c?r1=1.2&r2=1.3)
pgsql/src/include/replication:
    walreceiver.h (r1.4 -> r1.5)
    (http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/src/include/replication/walreceiver.h?r1=1.4&r2=1.5)
pgsql/src/include/storage:
    pmsignal.h (r1.28 -> r1.29)
    (http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/src/include/storage/pmsignal.h?r1=1.28&r2=1.29)
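
For readers who want a feel for the retry behaviour the log message describes, here is a minimal sketch in C. It is not the code from xlog.c: every name in it (WalSource, StandbyHooks, wait_for_wal, FakeStandby and the fake_* helpers) is hypothetical, and the order in which the archive, the stream, and the trigger file are checked is an assumption of the sketch. It only illustrates the general shape: try restore_command, make a single streaming attempt, check for the trigger file, pause, and repeat.

```c
#include <stdbool.h>
#include <stddef.h>

/* Where the next WAL segment came from, or WAL_NONE if the trigger
 * file ended standby mode first. All names here are hypothetical. */
typedef enum { WAL_NONE, WAL_FROM_ARCHIVE, WAL_FROM_STREAM } WalSource;

/* Hypothetical hooks standing in for the real server machinery. */
typedef struct {
    bool (*run_restore_command)(void *ctx); /* true if the segment was restored */
    bool (*try_streaming)(void *ctx);       /* one attempt, no retry inside */
    bool (*trigger_file_found)(void *ctx);
    void (*pause)(void *ctx);               /* e.g. sleep a few seconds */
    void *ctx;
} StandbyHooks;

/* Retry until a segment is available or the trigger file appears.
 * On return, *passes holds the number of passes made.
 * The order of the checks is an assumption of this sketch. */
WalSource wait_for_wal(const StandbyHooks *h, int *passes)
{
    for (*passes = 1;; (*passes)++) {
        if (h->run_restore_command(h->ctx))
            return WAL_FROM_ARCHIVE;
        if (h->try_streaming(h->ctx))
            return WAL_FROM_STREAM;
        if (h->trigger_file_found(h->ctx))
            return WAL_NONE;
        h->pause(h->ctx);
    }
}

/* Scripted fake for experiments: each event becomes true once the pass
 * counter reaches the given number; 0 means "never". */
typedef struct {
    int pass;             /* current pass, advanced by fake_restore */
    int archive_on_pass;
    int stream_on_pass;
    int trigger_on_pass;
} FakeStandby;

static bool reached(int pass, int on_pass)
{
    return on_pass > 0 && pass >= on_pass;
}

static bool fake_restore(void *ctx)
{
    FakeStandby *f = ctx;
    f->pass++;            /* restore is the first check of every pass */
    return reached(f->pass, f->archive_on_pass);
}

static bool fake_stream(void *ctx)
{
    FakeStandby *f = ctx;
    return reached(f->pass, f->stream_on_pass);
}

static bool fake_trigger(void *ctx)
{
    FakeStandby *f = ctx;
    return reached(f->pass, f->trigger_on_pass);
}

static void fake_pause(void *ctx)
{
    (void) ctx;           /* a real hook would sleep here */
}

/* Bundle the fake callbacks with their state. */
StandbyHooks fake_hooks(FakeStandby *f)
{
    StandbyHooks h = { fake_restore, fake_stream, fake_trigger, fake_pause, f };
    return h;
}
```

The streaming hook makes one attempt per pass and reports failure to the caller, which loosely mirrors the described change where walreceiver connects once and the startup process decides whether to launch it again. To experiment, fill in a FakeStandby (for example, with archive_on_pass set to 3), pass fake_hooks(&fake) to wait_for_wal(), and inspect the returned source and the pass count.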