From: Chris Browne
To: pgsql-docs@postgresql.org
Subject: Re: [HACKERS] Replication documentation addition
Date: Mon, 30 Oct 2006 12:23:18 -0500
Message-ID: <60k62h4t15.fsf@dba2.int.libertyrms.com>
References: <200610261553.k9QFr9V23851@momjian.us>

bruce@momjian.us (Bruce Momjian) writes:
> With no new additions submitted today, I have moved my text into our
> SGML documentation:
>
> http://momjian.us/main/writings/pgsql/sgml/failover.html
>
> Please let me know what additional changes are needed.

It's looking a lot improved to me...

There are still numerous places where it needs s/Slony/Slony-I/g because there is more than one thing out there called "Slony," only one of which is the single-master-to-multiple-subscribers asynchronous replication system...
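On the query broadcasting section, the way servers drift apart is easy to see in a toy model. The sketch below is plain Python, not anything from PostgreSQL or Slony-I; the Replica class, its nextval() method, and broadcast() are made-up names for illustration, and each "statement" is just a function evaluated locally on every server, the way broadcast SQL text would be.

```python
import random


class Replica:
    """Toy stand-in for one server; hypothetical, not a PostgreSQL API."""

    def __init__(self, seed):
        self.rng = random.Random(seed)  # each server has its own random state
        self.seq = 0                    # each server has its own sequence counter
        self.rows = []

    def nextval(self):
        self.seq += 1
        return self.seq

    def apply(self, statement):
        # The statement is evaluated locally, as broadcast SQL text would be.
        self.rows.append(statement(self))


def broadcast(replicas, statement):
    """Send the same statement to every replica."""
    for replica in replicas:
        replica.apply(statement)


a, b = Replica(seed=1), Replica(seed=2)

# A fully deterministic statement leaves both servers with the same row.
broadcast([a, b], lambda server: ("alice", 42))
in_sync_deterministic = a.rows[-1] == b.rows[-1]

# random() is evaluated separately on each server, so the rows differ.
broadcast([a, b], lambda server: ("bob", server.rng.random()))
in_sync_random = a.rows[-1] == b.rows[-1]

# A sequence value consumed on only one server (say, by a transaction
# that aborted elsewhere) shifts every later nextval() on that server.
b.nextval()
broadcast([a, b], lambda server: ("carol", server.nextval()))
in_sync_sequence = a.rows[-1] == b.rows[-1]

# Application-level workaround: compute the value once, then broadcast
# it as a literal so every server stores the same thing.
value = random.random()
broadcast([a, b], lambda server: ("dave", value))
in_sync_literal = a.rows[-1] == b.rows[-1]

print(in_sync_deterministic, in_sync_random, in_sync_sequence, in_sync_literal)
```

Only the statements whose values are fixed before they are broadcast stay in sync. That is a matter of how updates are processed on each host, not of how hard the thing is to set up.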
"This can be complex to set up because functions like random() and CURRENT_TIMESTAMP will have different values on different servers, and sequences should be consistent across servers."

It doesn't make sense to call this "complex to set up." This problem isn't about complexity of setup; it is about whether updates are processed identically on different hosts.

Perhaps better:

"Query broadcasting can break down such that servers fall out of sync if the queries have nondeterministic behavior. For instance, functions like random(), CURRENT_TIMESTAMP, and nextval('some_sequence') will take on different values on different servers. Care must be taken at the application level to make sure that queries are all fully deterministic and that they either COMMIT or ABORT on all servers."

"24.6. Clustering For Load Balancing

In clustering, each server can accept write requests, and these write requests are broadcast from the original server to all other servers before each transaction commits. Under heavy load, this can cause excessive locking and performance degradation. It is implemented by Oracle in their RAC product. PostgreSQL does not offer this type of load balancing, though PostgreSQL two-phase commit can be used to implement this in application code or middleware."

Something doesn't feel entirely right here...

How about...

"24.6. Multimaster Replication For Load Balancing

In this scenario, each server can accept write requests, which are broadcast from the original server to all other servers before each transaction commits in order to ensure consistency. Unfortunately, under heavy load, the cost of distributing locks across servers can lead to substantial performance degradation. It is implemented by Oracle in their RAC product. PostgreSQL does not offer this type of load balancing, though PostgreSQL two-phase commit may be used to implement this in application code or middleware.

The communications costs involved in distributing locks and writes have the result that write operations are considerably more expensive than they would be on a single server. In general, the cost of distributed locking means that this clustering approach is only usable across a cluster of servers at a local site. There will only be a performance "win" if the cluster mostly processes read-only traffic that the cluster can distribute across a larger number of database servers. Write performance generally degrades a fair bit as compared to using a single database server.

Reliability should be enhanced since the cluster should be able to continue work even if some of the members of the cluster should fail."

"24.7. Clustering For Parallel Query Execution

This allows multiple servers to work on a single query. One possible way this could work is for the data to be split among servers and for each server to execute its part of the query and results sent to a central server to be combined and returned to the user. There currently is no PostgreSQL open source solution for this."

This seems a bit thin.

"24.7. Clustering For Parallel Query Execution

This allows multiple servers to work concurrently on a single query, analogous to the way RAID permits multiple disk drives to respond concurrently to disk I/O requests. One way this could work is for the data to be partitioned across the servers, where each server executes its part of the query, submitting results to a central server to be combined and returned to the user. There currently is no PostgreSQL open source solution for this."

-- 
select 'cbbrowne' || '@' || 'acm.org';
http://cbbrowne.com/info/advocacy.html
Why do we put suits in a garment bag, and put garments in a suitcase?