Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1t95c5-007hzn-5N for pgsql-general@arkaria.postgresql.org; Thu, 07 Nov 2024 16:44:07 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.94.2) (envelope-from ) id 1t95c2-00HOHl-Ea for pgsql-general@arkaria.postgresql.org; Thu, 07 Nov 2024 16:44:03 +0000 Received: from magus.postgresql.org ([2a02:c0:301:0:ffff::29]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1t95c2-00HOH3-0D for pgsql-general@lists.postgresql.org; Thu, 07 Nov 2024 16:44:02 +0000 Received: from mail-wm1-x32d.google.com ([2a00:1450:4864:20::32d]) by magus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.94.2) (envelope-from ) id 1t95bx-000hgt-Rn for pgsql-general@postgresql.org; Thu, 07 Nov 2024 16:44:01 +0000 Received: by mail-wm1-x32d.google.com with SMTP id 5b1f17b1804b1-4314c452180so13573915e9.0 for ; Thu, 07 Nov 2024 08:43:58 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=leisi.net; s=google; t=1730997838; x=1731602638; darn=postgresql.org; h=to:date:message-id:subject:mime-version:from:from:to:cc:subject :date:message-id:reply-to; bh=kvC4jNW19v9FTIhKuIwMAxARf1mSSQi8NSEAQXGgU7w=; b=Zvs/GU6r58xT6RDy4hXeN3mRDjxuTDC1sNyqBGUTPyeAr7hxtrwt/r4vuYGfYtvJ1p nSwiaDv+e4qeSpv8zcWowr9kNnpc/wY33vZ8zZjSJadvqRLR4+/ogOYrPAuAgyV0FNXr wKtz+Hk50RFZIc6kdTJ/JSINei9h9wgCbrh8Q= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1730997838; x=1731602638; h=to:date:message-id:subject:mime-version:from:x-gm-message-state :from:to:cc:subject:date:message-id:reply-to; bh=kvC4jNW19v9FTIhKuIwMAxARf1mSSQi8NSEAQXGgU7w=; b=DXuu9MIQO7vO15buwX4tbVUp9RZhNH9LV6iDJkHbGhRO+2hSTzI9PzZVz0jQdV8aiI +LXLfxJXUrT90QIo/vOGULMc1L0SyM+F593LNKoKajbGel9+bqfbYA7agVsKDhcxHa+l my8jGB4TAt+5HK7d7JlprJu72sdJd6VV6ZKMfteRDJmig1hzA57JfFFg8GH4/3mbZLIP lKOOAQ8m7kch+iqdIdQJgMWO+aFQxrGccs9B08NmoC4qgG5VM4mTiCQZbBWuHsD8KkAQ fn3fIaH3+KMXWPZCC73hoXlRFiA+1LO75BjHlE4rkA64OdX5vCCmJnSfmKHbBJ7njtV/ hqGA== X-Gm-Message-State: AOJu0YyKbD1QpKyvcOVWoz0+EBa2xmIndUGiDz6FzKL16wdC92Ved/lb aIGjw8NGZY+O08Bw/jJo5TON70ZCODNotc856oLHH9cuNGOpJZLHvE9SKvrF3tyKh0YdlaLFQxc = X-Google-Smtp-Source: AGHT+IGqVK1Z7x0sCiYWgmNXPCpDyJHAvxVEnEsGlp/ZBzIGRAE3op2B2oyfcm0vYFUOeeiLQRlpgg== X-Received: by 2002:a05:600c:4ecf:b0:431:405a:f93b with SMTP id 5b1f17b1804b1-432b686ec50mr1846265e9.10.1730997837121; Thu, 07 Nov 2024 08:43:57 -0800 (PST) Received: from smtpclient.apple ([2a02:169:ce0e:0:6475:9219:63b5:475b]) by smtp.gmail.com with ESMTPSA id 5b1f17b1804b1-432b054b3fesm30133225e9.17.2024.11.07.08.43.56 for (version=TLS1_2 cipher=ECDHE-ECDSA-AES128-GCM-SHA256 bits=128/128); Thu, 07 Nov 2024 08:43:56 -0800 (PST) From: Matthias Leisi Content-Type: multipart/alternative; boundary="Apple-Mail=_F449998B-15C3-4047-8B89-06C3172F5D97" Mime-Version: 1.0 (Mac OS X Mail 16.0 \(3774.600.62\)) Subject: Advice on cluster architecture for two related, but distinct, use cases Message-Id: <897B047D-B99D-4637-ACBD-E98EC3D38DD5@leisi.net> Date: Thu, 7 Nov 2024 17:43:46 +0100 To: pgsql-general@postgresql.org X-Mailer: Apple Mail (2.3774.600.62) List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk --Apple-Mail=_F449998B-15C3-4047-8B89-06C3172F5D97 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset=utf-8 Dear all, (This is a follow-up to a question I asked almost exactly a year ago, = https://postgrespro.com/list/thread-id/2670756#726F3765-858C-4AC0-A7B0-5CB= 6720E4B37@leisi.net - the requirements have changed since then, and the = platform has changed from OpenBSD to Linux, which may make some things = easier.) I=E2=80=99m looking for advice on Postgres cluster architecture(s) for = two related but distinct use cases. Ideally, the approaches for the two = use cases would not differ too widely. The goal of clustering is low RPO (I guess we need sync clustering) and = RTO (ideally almost-instant failover); throughput is not a concern = (it=E2=80=99s relatively low transaction volume). Latency (due to the = distance between datacenters which provide georedundancy) is a fact we = are willing to accept (eg for syncs to happen on all cluster members). The first use case is in an environment under our own control (and where = eg a DBA could intervene). We can theoretically run any number of = cluster instances, but assume we would use an even number (split over = the two datacenters), or potentially an odd number of nodes (eg with an = arbiter). In the second use case, the environment is not under our control, so we = can only assume basic network connectivity from the application to the = DB, and between the DBs (the latter potentially through an SSH tunnel if = needed). In this use case, we can not assume a person to intervene if a = node goes down, and would prefer some automated failover to the other = node (this automation would also be welcome for the first use case, eg = if something happens while nobody is watching). We can not assume eg a = load balancer. There could be various ways how the environment in the second use case = is set up, ranging from =E2=80=9Eapplication and database running on the = same box=E2=80=9C (well, no clustering for you then=E2=80=A6), to = dedicated two- or three node cluster. In both use cases, we have full control over the application and the = database itself. =46rom reading various docs, it seems we would need something like = Patroni (/Percona), at least for the first use case. However it seems = relatively complex to set up and operate to me. I would appreciate your experience and input into which approach would = best fit the two use cases. We are also willing to engage in paid = consulting. Thanks, =E2=80=94 Matthias --=20 Matthias Leisi Katzenr=C3=BCtistrasse 68, 8153 R=C3=BCmlang Mobile +41 79 377 04 43 matthias@leisi.net --Apple-Mail=_F449998B-15C3-4047-8B89-06C3172F5D97 Content-Transfer-Encoding: quoted-printable Content-Type: text/html; charset=utf-8 Dear = all,

(This is a follow-up to a question I asked = almost exactly a year ago, https://postgrespro.com/list/thread-id/267075= 6#726F3765-858C-4AC0-A7B0-5CB6720E4B37@leisi.net - the = requirements have changed since then, and the platform has changed from = OpenBSD to Linux, which may make some things = easier.)


I=E2=80=99m looking for = advice on Postgres cluster architecture(s) for two related but distinct = use cases. Ideally, the approaches for the two use cases would not = differ too widely.

The goal of clustering is = low RPO (I guess we need sync clustering) and RTO (ideally = almost-instant failover); throughput is not a concern (it=E2=80=99s = relatively low transaction volume). Latency (due to the distance between = datacenters which provide georedundancy) is a fact we are willing to = accept (eg for syncs to happen on all cluster = members).


The first use case is = in an environment under our own control (and where eg a DBA could = intervene). We can theoretically run any number of cluster instances, = but assume we would use an even number (split over the two datacenters), = or potentially an odd number of nodes (eg with an = arbiter).


In the second use = case, the environment is not under our control, so we can only assume = basic network connectivity from the application to the DB, and between = the DBs (the latter potentially through an SSH tunnel if needed). In = this use case, we can not assume a person to intervene if a node goes = down, and would prefer some automated failover to the other node (this = automation would also be welcome for the first use case, eg if something = happens while nobody is watching). We can not assume eg a load = balancer.

There could be various ways how the = environment in the second use case is set up, ranging from = =E2=80=9Eapplication and database running on the same box=E2=80=9C = (well, no clustering for you then=E2=80=A6), to dedicated two- or three = node cluster.


In both use cases, = we have full control over the application and the database = itself.

=46rom reading various docs, it seems = we would need something like Patroni (/Percona), at least for the first = use case. However it seems relatively complex to set up and operate to = me.

I would appreciate your experience and = input into which approach would best fit the two use cases. We are also = willing to engage in paid = consulting.

Thanks,
=E2=80=94 = Matthias

-- 
Matthias = Leisi
Katzenr=C3=BCtistrasse 68, 8153 R=C3=BCmlang
Mobile +41 79 = 377 04 43
matthias@leisi.net



= --Apple-Mail=_F449998B-15C3-4047-8B89-06C3172F5D97--