Re: [HACKERS] Replication documentation addition

From: Markus Schiltknecht <[email protected]>
To: Bruce Momjian <[email protected]>
Cc: Hannu Krosing <[email protected]>
Cc: PostgreSQL-documentation <[email protected]>
Cc: PostgreSQL-development <[email protected]>
Subject: Re: [HACKERS] Replication documentation addition
Date: Wed, 25 Oct 2006 11:38:11 +0200
Message-ID: <[email protected]> (raw)
In-Reply-To: <[email protected]>
References: <[email protected]>

Hi,

Bruce Momjian wrote:
> I have updated the text.  Please let me know what else I should change. 
> I am unsure if I should be mentioning commercial PostgreSQL products in
> our documentation.

I support your POV and vote for not including any pointers to commercial
extensions in the official documentation. If they are mentioned at all,
they should go in 'external-projects.sgml', where PostGIS, PgAdmin and
other projects are mentioned.

I can't really get excited about excluding the term 'replication',
because it's what most people are looking for. It's a well-known term.
Sorry if it sounded that way, but I did not mean to avoid that term.

The newly created terms 'Query Broadcast Load Balancing' or even worse 
'Multi-Master Load Balancing' are more confusing than helpful, because 
these terms do not exist. (See the googlefight in [1])

Can we name the chapter "Fail-over, Load-Balancing and Replication
Options"? That would fit everything and contain the necessary buzzwords.

Also, I'm still missing Multi- vs Single-Master, which are also commonly 
used terms.

IMHO, it does not make sense to speak of synchronous replication for
'Shared Disk Fail Over'. It's not replication, because there's no replica.

The Data Partitioning paragraph should probably mention its close
relation to data partitioning across table spaces (and make the
differences clear).

What you call 'Query Broadcast Load Balancing' is also a form of
multi-master replication, so naming only the latter 'Multi-Master Load
Balancing' is misleading.

I'd propose to add a subsection 'Synchronous, Multi-Master Replication' 
and explain the different possibilities on how to do that:

* Query-Based
* with 2PC
* Distributed SHMEM
* (perhaps mention the optimized Postgres-R algorithm ;-)
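
To illustrate the 'with 2PC' item above, here is a minimal in-memory
sketch of two-phase commit across several masters. It is a toy under
stated assumptions, not how any particular PostgreSQL replication
project implements it: the names Node and replicate are hypothetical,
there is no durable logging, and coordinator failure is not handled.

```python
# Toy two-phase commit; Node and replicate are hypothetical names.
class Node:
    def __init__(self, name, fail_on_prepare=False):
        self.name = name
        self.fail_on_prepare = fail_on_prepare  # simulate a "no" vote
        self.data = {}       # committed state
        self.pending = None  # writes prepared but not yet committed

    def prepare(self, writes):
        # Phase 1: vote. A real node would durably log the writes here.
        if self.fail_on_prepare:
            return False
        self.pending = dict(writes)
        return True

    def commit(self):
        # Phase 2, commit path: make the prepared writes visible.
        self.data.update(self.pending)
        self.pending = None

    def rollback(self):
        # Phase 2, abort path: discard the prepared writes.
        self.pending = None


def replicate(nodes, writes):
    """Apply writes on all nodes or on none. Returns True on commit."""
    prepared = []
    for node in nodes:
        if node.prepare(writes):
            prepared.append(node)
        else:
            # One "no" vote aborts the transaction everywhere.
            for p in prepared:
                p.rollback()
            return False
    for node in nodes:
        node.commit()
    return True
```

The point of the sketch is only that every master must agree in the
prepare phase before any of them commits, which is what makes this
approach synchronous.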

What you called 'Single-Query Clustering' is probably better known as
'Parallel Query Execution'. It can be combined with all types of
replication (every combination of async / sync and Single- /
Multi-Master). It may be a form of load balancing, but it depends on
some form of replication to distribute the data first.

I liked Chris Brown's documentation in [2], which was clearer regarding
replication (which can be used to do fail-over, load-balancing,
data-partitioning or parallel query execution). I'd like to keep all
those things a little more separate to make them clear.

Regards

Markus

[1]: Googlefight: "Multi-Master Load Balancing" vs "Multi-Master 
Replication": http://tinyurl.com/y3k76r

[2]: Chris Brown's proposal for replication documentation:
http://archives.postgresql.org/pgsql-patches/2006-08/msg00026.php
