From: Markus Schiltknecht <[email protected]>
To: Emmanuel Cecchet <[email protected]>
Cc: [email protected]
Cc: Sequoia general mailing list <[email protected]>
Cc: Bruce Momjian <[email protected]>
Subject: Re: [Sequoia] PostgreSQL Documentation of High Availability and Load
Date: Wed, 22 Nov 2006 11:54:00 +0100
Message-ID: <[email protected]> (raw)
In-Reply-To: <[email protected]>
References: <[email protected]>
	<[email protected]>
	<[email protected]>
	<[email protected]>
	<[email protected]>
	<[email protected]>

Hello Emmanuel,

Emmanuel Cecchet wrote:
>>> Even here I think that there is a common confusion between 
>>> performance and scalability. Most people think that by having 
>>> multiple nodes their query will run faster, which is obviously wrong 
>>> if your original workload does not saturate a single node. 
>>
>> Sure. Do you think that should be made clearer?
> Yes, I think so because this is a very common belief that we experience 
> with new users.

Okay, I have forwarded that to Bruce, who's editing the documentation 
(and is a native English speaker). I'm not sure how we can cover this, 
as we are very general in our description.

You might want to recheck the paragraph, which is now called 
"Synchronous Multi-Master Replication":

http://momjian.us/main/writings/pgsql/sgml/high-availability.html

I'm particularly unsure where Sequoia would fit in. There is still the 
split between "Statement-Based Replication Middleware" and "Synchronous 
Multi-Master Replication".

Does Sequoia offer any form of async replication?

>>> The replication mechanisms are even adding overhead (usually 
>>> perceived as increased latency) to the query execution. It is ONLY 
>>> when the workload increases that you can see throughput going up 
>>> (ideally somewhat close to the workload increase) and query latency 
>>> remaining stable. Unless you really have a parallel query execution 
>>> (that is only efficient for big queries anyway), you will never see a 
>>> performance improvement on a single query execution since this is 
>>> always the same database engine that executes the query in the end.
>>
>> I don't quite agree with that statement, but probably I'm just 
>> misreading it. If you have enough concurrent transactions you can 
>> spread among the nodes, you'll certainly note an improvement. After 
>> all, it's a huge difference whether your single node is processing 
>> only ten or hundreds of concurrent transactions.
> Yes, but that already means that your single node was somewhat already a 
> bottleneck. My point was that for low workloads (note that low is 
> relative here since many users have dual-cpu machines with decent RAM 
> and disks, and it takes quite a number of concurrent transactions to get 
> to the peak point), you will not see any improvement and you'll even see 
> a slight degradation, especially from a latency perspective. Below the 
> peak point of a single machine, you will get the same performance (from 
> a client point of view) but the load on the various machine resources 
> will be divided by the number of machines in the cluster (at best). For 
> example, if I have a workload of 50 requests/second that generates 50% 
> cpu load on 1 node, I will still get my 50 req/s with 2 machines but the 
> cpu load will only be 25% on each node.
> Now the contention can be elsewhere (disk, locks, ...) and exhibit other 
> scalability characteristics but it usually conforms to the model I 
> described.

Agreed.
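
To make the model above concrete, here is a minimal sketch. It assumes 
CPU load grows linearly with request rate (so 50 req/s at 50% load 
implies a peak of about 100 req/s per node), that load spreads evenly, 
and that replication itself costs nothing. The function name and the 
150 req/s figure are hypothetical, chosen only for illustration:

```python
def cluster_model(offered_rps, peak_rps_per_node, nodes):
    """Idealized model: return (throughput in req/s, per-node utilization 0..1).

    Assumptions (not from any benchmark): linear CPU cost per request,
    perfectly even load distribution, zero replication overhead.
    """
    capacity = peak_rps_per_node * nodes      # ideal aggregate peak of the cluster
    throughput = min(offered_rps, capacity)   # clients never get more than they ask for
    utilization = throughput / capacity       # load shared equally by all nodes
    return throughput, utilization


# Below the single-node peak: same throughput, load divided by node count.
below_peak_1 = cluster_model(50, 100, 1)   # (50, 0.5)
below_peak_2 = cluster_model(50, 100, 2)   # (50, 0.25)

# Above the single-node peak: only now does adding a node raise throughput.
above_peak_1 = cluster_model(150, 100, 1)  # (100, 1.0)
above_peak_2 = cluster_model(150, 100, 2)  # (150, 0.75)

for label, result in [("50 req/s, 1 node", below_peak_1),
                      ("50 req/s, 2 nodes", below_peak_2),
                      ("150 req/s, 1 node", above_peak_1),
                      ("150 req/s, 2 nodes", above_peak_2)]:
    print(label, result)
```

Real clusters will do worse than this, since writes have to be applied 
on every node and contention may sit in disks or locks rather than CPU.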

>> Of course, the amount of concurrent transactions limits how far a 
>> replication solution can scale. Having more nodes than concurrent 
>> transactions does not make sense. (Of course with the exception of 
>> parallel query execution.)
> Yes but don't underestimate the capability of a single node to execute 
> transactions in parallel as well. Oftentimes sending 2 concurrent 
> transactions to a single node or to 2 different nodes does not make any 
> difference (obviously it depends on the nature of the transaction).

Okay, I just have to believe that. Up until now I've mostly relied on 
theoretical estimates, rather than hard facts. :-)  You seem to have 
run some real benchmarks. Did you publish them?

>>> To summarize, clustering solutions provide performance scalability 
>>> (stable latency, throughput increasing almost linearly with load) but 
>>> not performance improvement on individual query execution time. 
>>
>> Yes, for writing transactions, no for read-only ones (queries?). Or 
>> why do you have to add overhead to read-only queries?
> In a middleware approach you have to proxy the read results as well so 
> you will add some latency there. When replication is integrated in the 
> database you can prevent this extra hop but still the replication logic 
> adds some overhead to any query (that seems inevitable if you want to 
> ensure consistency).

Ah, okay, yes, the extra hop through the proxy has to be added even to 
read-only queries.
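
As a rough sketch of that point, the latency of a single query can be 
thought of as a sum of independent costs. All numbers below are made up 
for illustration, not measurements, and the function name is 
hypothetical:

```python
def single_query_latency_ms(exec_ms, proxy_hop_ms=0.0, replication_ms=0.0):
    """Additive latency model for one query (illustrative assumption only).

    exec_ms:        time the database engine itself needs
    proxy_hop_ms:   extra network hop through a middleware proxy
    replication_ms: bookkeeping needed to keep the replicas consistent
    """
    return exec_ms + proxy_hop_ms + replication_ms


# Hypothetical costs: the engine does the same work in every setup.
standalone = single_query_latency_ms(10.0)
integrated = single_query_latency_ms(10.0, replication_ms=0.25)
middleware = single_query_latency_ms(10.0, proxy_hop_ms=0.5, replication_ms=0.25)

print(standalone, integrated, middleware)
```

Under these assumptions no replicated setup beats the standalone node on 
a single query; the gain only shows up as throughput under load.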

>> To make it work and production ready as soon as possible. ;-)  I'm 
>> currently working on initialization and recovery.
> Good luck, this is the hardest part ! You'll soon figure out that 
> replication was really the easy part !

Thanks.

May I ask what you use for automatic testing and benchmarking? I 
currently spend 90% of my time testing. Starting up the GCS and two 
databases, then attaching the debugger to every process which could 
possibly go haywire, really takes a F***ING lot of time!

Regards

Markus




