Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1t9CFp-008LLA-V8 for pgsql-general@arkaria.postgresql.org; Thu, 07 Nov 2024 23:49:33 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.94.2) (envelope-from ) id 1t9CFn-003vkG-EE for pgsql-general@arkaria.postgresql.org; Thu, 07 Nov 2024 23:49:32 +0000 Received: from makus.postgresql.org ([2001:4800:3e1:1::229]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1t9CFm-003vk7-Uf for pgsql-general@lists.postgresql.org; Thu, 07 Nov 2024 23:49:31 +0000 Received: from mail-pj1-x102e.google.com ([2607:f8b0:4864:20::102e]) by makus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.94.2) (envelope-from ) id 1t9CFk-000i8b-6I for pgsql-general@lists.postgresql.org; Thu, 07 Nov 2024 23:49:30 +0000 Received: by mail-pj1-x102e.google.com with SMTP id 98e67ed59e1d1-2e2e88cb0bbso1196514a91.3 for ; Thu, 07 Nov 2024 15:49:28 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1731023367; x=1731628167; darn=lists.postgresql.org; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:from:to:cc:subject:date:message-id:reply-to; bh=T/3BgV4dzqdDuQY7eZmaNK5HoyziQPLr5OHZ0GVcu7A=; b=e9GWBgadZXIoDrxkNI7/RbOvm4QoPL6vXDhXeKYaaoVuGifU4hgPtyxKTxdq9TgGVo ES+dyq0HpUAW498L0+6lZ9wkBLKhGkRED6gllQ8eWjRy7QQowwFjpLIN82ZBJvIpr7Lo 3K6eTdr+4o7efPNK97+QytFOpweQgv/QDdKONQ+UqFOvs6bIiL6gGiGou1Tj9mXVAyOV UjKBUTvBxjhIRIypTjFQBVEYGKFT2az6z7Rxg/HupJofmgaYZmlNBF2Z7ZLFuPgkQl8w 6yfyIIopYp1mxfpTSdrvmfE17NlOUwbs+8MmaKaPGkZqEjDLw1+aFpmn0XtymX2cFDgY jGUA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1731023367; x=1731628167; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=T/3BgV4dzqdDuQY7eZmaNK5HoyziQPLr5OHZ0GVcu7A=; b=PxPGTneUx+pxDBKxn8qyLnZ+c4bfGijwmDLz80GcoPNDUe0+VqSjuz+kLIHat+sLcj 3RMaWPJu2rIE80qSLzknZ53UwLO90S8rp5TKop3JLQPb5pCv6yJEutqep9oL6T1G/Cs/ llEbdt0GxFiXoJTM/WXoQHh99qd19yHeM5VOgP9IZrpGUNjkmSHnWrj501gA7BdS1X4v 2nEEciKPihn+Fu/kW779g1Eu4lclHtm73hYSds48eiDw/x1H8JIcaxMfSj56y8y/Qa1D t86fSb59MFCuTrHXdZmZgL8ZYhEw6wElqxGFmB/LgLr+m58xwcdwHUBoPc83weqtbEOp yuPQ== X-Gm-Message-State: AOJu0Yyzc9i2Ot+r66xje/FUDVrNk+a8PCjk/TQ5M99jG9o93kdGV2hm JY7oM+30QmNl3cbj6f4ngzd7jhw/5AjvJ5m1vcu1MMNn0MAJG0M6OJSZg4egbKkF6KkEnzs7xDZ MiwwPV+rGummfKsGN1Nk1nzNYTqJAWA== X-Google-Smtp-Source: AGHT+IGZjRZlS6dTS6x8+RxVGIGtRDzH3iKbjUYE7PLYiw7Cn/4/3ZLI85ss96X0Ah6KK6//x38ZxuXUASM+khNOsvs= X-Received: by 2002:a17:90b:38ce:b0:2cb:4e14:fd5d with SMTP id 98e67ed59e1d1-2e9b173c84cmr1445651a91.17.1731023366952; Thu, 07 Nov 2024 15:49:26 -0800 (PST) MIME-Version: 1.0 References: <18DEA0D4-1D33-48CA-A037-FF01AFE37471@leisi.net> In-Reply-To: <18DEA0D4-1D33-48CA-A037-FF01AFE37471@leisi.net> From: Zahid Rahman Date: Thu, 7 Nov 2024 23:49:16 +0000 Message-ID: Subject: Re: Advice on cluster architecture for two related, but distinct, use cases To: Matthias Leisi Cc: pgsql-general@lists.postgresql.org Content-Type: multipart/alternative; boundary="000000000000e2a0d006265b4855" List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk --000000000000e2a0d006265b4855 Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable Perhaps a 14 minute investment in this article may prove fruitful. https://medium.com/@martin.hodges/adding-a-postgres-high-availability-datab= ase-to-your-kubernetes-cluster-634ea5d6e4a1 On Thu, 7 Nov 2024, 21:06 Matthias Leisi, wrote: > Dear all, > > (This is a follow-up to a question I asked almost exactly a year ago, > https://postgrespro.com/list/thread-id/2670756#726F3765-858C-4AC0-A7B0-5C= B6720E4B37@leisi.net - > the requirements have changed since then, and the platform has changed fr= om > OpenBSD to Linux, which may make some things easier.) > > > I=E2=80=99m looking for advice on Postgres cluster architecture(s) for tw= o related > but distinct use cases. Ideally, the approaches for the two use cases wou= ld > not differ too widely. > > The goal of clustering is low RPO (I guess we need sync clustering) and > RTO (ideally almost-instant failover, but a failover process of up to a > minute in the worst case could be acceptable); throughput is not a concer= n > (it=E2=80=99s relatively low transaction volume except for some often-wri= tten > statistics data, which is still moderate). Latency (due to the distance > between datacenters for georedundancy) is a fact we are willing to accept= . > > > The first use case is in an environment under our own control (and where > eg a DBA could intervene). We can theoretically run any number of cluster > instances, but assume we would use an even number (split over the two > datacenters), or potentially an odd number of nodes (eg with an arbiter). > We could use a load balancer, but I guess this would strongly deviate fro= m > the second use case: > > > In the second use case, the environment is not under our control, so we > can only assume basic network connectivity from the application to the DB= , > and between the DBs (the latter potentially through an SSH tunnel if > needed). In this use case, we can not assume a person to intervene if a > node goes down, and would prefer some automated failover to the other nod= e > (this automation would also be welcome for the first use case, eg if > something happens while nobody is watching). We can not assume eg a load > balancer. > > There could be various ways how the environment in the second use case is > set up, ranging from =E2=80=9Eapplication and database running on the sam= e box=E2=80=9C > (well, no clustering for you then=E2=80=A6), to dedicated two- or three n= ode > database cluster serving a number of application machines. > > > In both use cases, we have full control over the application and the > database code and environment. > > From reading various docs, it seems we would need something like Patroni > (/Percona), at least for the first use case. However it seems relatively > complex to set up and operate. > > I would appreciate your experience and input into which approach would > best fit the two use cases. We are also willing to engage in paid > consulting. > > Thanks, > =E2=80=94 Matthias > > -- > Matthias Leisi > Katzenr=C3=BCtistrasse 68, 8153 R=C3=BCmlang > > Mobile +41 79 377 04 43 > matthias@leisi.net > --000000000000e2a0d006265b4855 Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable
Perhaps a 14 minute investment in this article may p= rove fruitful.

On Thu, 7 Nov 2024, 21:06 Matt= hias Leisi, <matthias@leisi.net> wrote:
Dear all,

(This is a follow-up to = a question I asked almost exactly a year ago,=C2=A0https://postgrespro.com/list/th= read-id/2670756#726F3765-858C-4AC0-A7B0-5CB6720E4B37@leisi.net=C2=A0- t= he requirements have changed since then, and the platform has changed from = OpenBSD to Linux, which may make some things easier.)

<= div>
I=E2=80=99m looking for advice on Postgres cluster archi= tecture(s) for two related but distinct use cases. Ideally, the approaches = for the two use cases would not differ too widely.

The goal of clustering is low RPO (I guess we need sync clustering) and RT= O (ideally almost-instant failover, but a failover process of up to a minut= e in the worst case could be acceptable); throughput is not a concern (it= =E2=80=99s relatively low transaction volume except for some often-written = statistics data, which is still moderate). Latency (due to the distance bet= ween datacenters for georedundancy) is a fact we are willing to accept.


The first use case is in an environmen= t under our own control (and where eg a DBA could intervene). We can theore= tically run any number of cluster instances, but assume we would use an eve= n number (split over the two datacenters), or potentially an odd number of = nodes (eg with an arbiter). We could use a load balancer, but I guess this = would strongly deviate from the second use case:

<= br>
In the second use case, the environment is not under our cont= rol, so we can only assume basic network connectivity from the application = to the DB, and between the DBs (the latter potentially through an SSH tunne= l if needed). In this use case, we can not assume a person to intervene if = a node goes down, and would prefer some automated failover to the other nod= e (this automation would also be welcome for the first use case, eg if some= thing happens while nobody is watching). We can not assume eg a load balanc= er.

There could be various ways how the environmen= t in the second use case is set up, ranging from =E2=80=9Eapplication and d= atabase running on the same box=E2=80=9C (well, no clustering for you then= =E2=80=A6), to dedicated two- or three node database cluster serving a numb= er of application machines.


In both= use cases, we have full control over the application and the database code= and environment.

From reading various docs, it se= ems we would need something like Patroni (/Percona), at least for the first= use case. However it seems relatively complex to set up and operate.
=

I would appreciate your experience and input into which= approach would best fit the two use cases. We are also willing to engage i= n paid consulting.

Thanks,
=E2=80=94 Mat= thias

--=C2=A0
--000000000000e2a0d006265b4855--