Received: from maia.hub.org (unknown [200.46.204.183]) by mail.postgresql.org (Postfix) with ESMTP id 01FFC633DE3; Thu, 25 Mar 2010 07:15:34 -0300 (ADT) Received: from mail.postgresql.org ([200.46.204.86]) by maia.hub.org (mx1.hub.org [200.46.204.183]) (amavisd-maia, port 10024) with ESMTP id 72800-07; Thu, 25 Mar 2010 10:15:23 +0000 (UTC) X-Greylist: domain auto-whitelisted by SQLgrey-1.7.6 X-Greylist: domain auto-whitelisted by SQLgrey-1.7.6 Received: from exprod7og126.obsmtp.com (exprod7og126.obsmtp.com [64.18.2.206]) by mail.postgresql.org (Postfix) with SMTP id 2EE00632E15; Thu, 25 Mar 2010 07:15:22 -0300 (ADT) Received: from source ([209.85.219.225]) by exprod7ob126.postini.com ([64.18.6.12]) with SMTP ID DSNKS6s3uMU7eWNFmw/ixGCo6OOWeFiYtezw@postini.com; Thu, 25 Mar 2010 03:15:22 PDT Received: by ewy25 with SMTP id 25so3473498ewy.7 for ; Thu, 25 Mar 2010 03:15:19 -0700 (PDT) Received: by 10.213.103.84 with SMTP id j20mr2769785ebo.3.1269512119642; Thu, 25 Mar 2010 03:15:19 -0700 (PDT) Received: from [192.168.1.117] (dsl-hkibrasgw2-ff67c300-165.dhcp.inet.fi [88.195.103.165]) by mx.google.com with ESMTPS id 14sm677610ewy.10.2010.03.25.03.15.16 (version=TLSv1/SSLv3 cipher=RC4-MD5); Thu, 25 Mar 2010 03:15:18 -0700 (PDT) Message-ID: <4BAB37B3.109@enterprisedb.com> Date: Thu, 25 Mar 2010 12:15:15 +0200 From: Heikki Linnakangas Organization: EnterpriseDB User-Agent: Mozilla-Thunderbird 2.0.0.22 (X11/20090706) MIME-Version: 1.0 To: Simon Riggs CC: Fujii Masao , Aidan Van Dyk , PostgreSQL-development , PostgreSQL Docs Subject: Re: [HACKERS] Re: [COMMITTERS] pgsql: Make standby server continuously retry restoring the next WAL References: <3f0b79eb1002092105r21e009d3v468496058ba04392@mail.gmail.com> <4B743E7D.5070603@enterprisedb.com> <3f0b79eb1002180337t1fab1395ve3491256672af15f@mail.gmail.com> <4BA0B079.3050301@enterprisedb.com> <3f0b79eb1003180727g7877743eq81274e014fe70a49@mail.gmail.com> <1268988724.3556.3.camel@ebony> <4BA361E4.7020309@enterprisedb.com> <3f0b79eb1003230017v16f4ecbeyc20e75beeffe8f1c@mail.gmail.com> <4BAA060A.2020000@enterprisedb.com> <1269472981.8481.8946.camel@ebony> <3f0b79eb1003241908n1e8f38e0q7cd7465163b3d7af@mail.gmail.com> <1269504491.8481.8965.camel@ebony> In-Reply-To: <1269504491.8481.8965.camel@ebony> Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit X-Virus-Scanned: Maia Mailguard 1.0.1 X-Spam-Status: No, hits=0.001 tagged_above=-10 required=5 tests=BAYES_50=0.001 X-Spam-Level: X-Archive-Number: 201003/15 X-Sequence-Number: 5380 (cc'ing docs list) Simon Riggs wrote: > The lack of docs begins to show a lack of coherent high-level design > here. Yeah, I think you're right. It's becoming hard to keep track of how it's supposed to behave. > By now, I've forgotten what this thread was even about. The major > design decision in this that keeps showing up is "remove pg_standby, at > all costs" but no reason has ever been given for that. I do believe > there is a "better way", but we won't find it by trial and error, even > if we had time to do so. This has nothing to do with pg_standby. > Please work on some clear docs for the failure modes in this system. > That way we can all read them and understand them, or point out further > issues. Moving straight to code is not a solution to this, since what we > need now is to all agree on the way forwards. If we ignore this, then > there is considerable risk that streaming rep will have a fatal > operational flaw. > > Please just document/diagram how it works now, highlighting the problems > that still remain to be solved. We're all behind you and I'm helping > wherever I can. Ok, here's my attempt at the docs. Read it as a replacement for the "High Availability, Load Balancing, and Replication" chapter, but of course many of the sections will be unchanged, as indicated below. ------------- Chapter 25. High Availability, Load Balancing, and Replication 25.1 Comparison of different solutions 25.2 Log-Shipping Standby servers A standby server can also be used for read-only queries. This is called Hot Standby mode, see chapter XXX 25.2.1 Planning Set up two servers with identical hardware ... 25.2.3 Standby mode In standby mode, the server continously applies WAL received from the master server. The standby server can receive WAL from a WAL archive (see restore_command) or directly from the master over a TCP connection (streaming replication). The standby server will also attempt to restore any WAL found in the standby's pg_xlog. That typically happens after a server restart, to replay again any WAL streamed from the master before the restart, but you can also manually copy files to pg_xlog at any time to have them replayed. At startup, the standby begins by restoring all WAL available in the archive location, calling restore_command. Once it reaches the end of WAL available there and restore_command fails, it tries to restore any WAL available in the pg_xlog directory (possibly stored there by streaming replication before restart). If that fails, and streaming replication has been configured, the standby tries to connect to the master server and stream WAL from it. If that fails or streaming replication is not configured, or if the connection is disconnected later on, the standby goes back to step 1 and tries to restoring the file from the archive again. This loop of retries from the archive, pg_xlog, and via streaming replication goes on until the server is stopped or failover is triggered by a trigger file. A corrupt or half-finished WAL file in the archive, or streamed from the master, causes a PANIC and immediate shutdown of the standby server. A corrupt WAL file is always a serious event which requires administrator action. If you want to recover a WAL file known to be corrupt as far as it can be, you can copy the file manually into pg_xlog. Standby mode is exited and the server switches to normal operation, when a trigger file is found (trigger_file). Before failover, it will restore any WAL available in the archive or in pg_xlog, but won't try to connect to the master or wait for files to become available in the archive. 25.2.4 Preparing Master for Standby servers Set up continous archiving to a WAL archive on the master, as described in the chapter "Continous Archiving and Point-In-Time_recovery". The archive location should be accessible from the standby even when the master is down, ie. it should reside on the standby server itself or another trusted server, not on the master server. If you want to use streaming replication, set up authentication to allow streaming replication connections. Set max_wal_senders. Take a base backup as described in chapter Continous Archiving and Point-In-Time_recovery / Making a Base Backup. 25.2.4.1 Authentication for streaming replication Ensure that listen_addresses allows connections from the standby server. 25.2.5 Setting up the standby server 1. Take a base backup, and copy it to the standby 2. Create a restore_command to restore files from the WAL archive. 3. Set standby_mode=on 4. If you want to use streaming replicaton, set primary_conninfo You can use restartpoint_command to prune the archive of files no longer needed by the standby. You can have any number of standby servers as long as you set max_wal_senders high enough in the master to allow them to be connected simultaneously. 25.2.6 Archive recovery based log shipping An alternative to the built-in standby mode desribed in the previous sections is to use a restore_command that polls the archive location. This was the only option available in versions 8.4 and below. In this setup, set standby_mode=off, because you are implementing the polling required for a standby server yourself. See contrib/pg_standby for a reference implementation of this. Note that the in this mode, the server will apply WAL one file at a time, so if you use the standby server for queries (see Hot Standby), there is a bigger delay between an action in the master and when the action becomes visible in the standby, corresponding the time it takes to fill up the WAL file. archive_timeout can be used to make that delay shorter. Also note that you can't combine streaming replication with this method. 25.3 Hot Standby 25.4 Incrementally Updated backups ------------- -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com