Re: Question about querying distributed databases

public inbox for [email protected]  

From: Greg Sabino Mullane <[email protected]>
To: me nefcanto <[email protected]>
Cc: Laurenz Albe <[email protected]>
Cc: Adrian Klaver <[email protected]>
Cc: [email protected]
Subject: Re: Quesion about querying distributed databases
Date: Wed, 5 Mar 2025 10:21:27 -0500
Message-ID: <CAKAnmmLyu_tub6_OVhfFJeypxT4wAtVPs8x6kQfO6YPvmNDsaA@mail.gmail.com>
In-Reply-To: <CAEHBEODw8svX557pjB_EL-Os7KWtwi-9Uq=RuCkRKgHVZWw8Bw@mail.gmail.com>
References: <CAEHBEOBuoMFWuhHM3L_Zr6o1enELju-Vns6Pknt4TT+6MFQOwQ@mail.gmail.com>
	<[email protected]>
	<CAEHBEOD969YrbPH_z9OEmThWx3-w4sMMaHLhZLOQwqCwE8Y58Q@mail.gmail.com>
	<[email protected]>
	<CAEHBEOBXzkGTqxQSYqmEFN5hbc=zsGWFpU9h8zf7AAPv4VdOWQ@mail.gmail.com>
	<[email protected]>
	<CAEHBEOBNoG8RkKuCcQQWkbYppMLMzA0MXq+s0kZ6wKWgD7+45Q@mail.gmail.com>
	<[email protected]>
	<CAEHBEODw8svX557pjB_EL-Os7KWtwi-9Uq=RuCkRKgHVZWw8Bw@mail.gmail.com>

On Wed, Mar 5, 2025 at 7:15 AM me nefcanto <[email protected]> wrote:

> I think if we put all databases into one database, then we have blocked
> our growth in the future.
>

I think this is premature optimization. Your products table has 100,000
rows. That's very tiny for the year 2025. Try putting everything on one box
with good indexes and you might be surprised at the performance.
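
A minimal sketch of the "one box with good indexes" point, for readers who
want to see the effect on a table of this size. It uses Python's built-in
sqlite3 module purely so that it runs anywhere; the products table, its
columns, and the index name are assumptions made up for the illustration, and
on Postgres you would inspect the plan with EXPLAIN instead.

```python
import sqlite3

# SQLite is used only so the sketch runs without a server; the table,
# columns, and index name are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE products (id INTEGER PRIMARY KEY, category_id INTEGER, name TEXT)"
)
conn.executemany(
    "INSERT INTO products (category_id, name) VALUES (?, ?)",
    ((i % 500, f"product {i}") for i in range(100_000)),
)

QUERY = "SELECT name FROM products WHERE category_id = ?"


def plan(sql):
    """Return the planner's description of how it would run the query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, (42,)).fetchall()
    # The human-readable detail is the last column of each plan row.
    return " | ".join(row[-1] for row in rows)


plan_before = plan(QUERY)  # no usable index yet: the whole table is scanned
conn.execute("CREATE INDEX idx_products_category ON products (category_id)")
plan_after = plan(QUERY)   # the planner now searches via the new index

matches = conn.execute(QUERY, (42,)).fetchall()
print(plan_before)
print(plan_after)
print(len(matches))
```

The exact plan text differs between database engines and versions, so treat
the printed plans as indicative only.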

> A monolith database can be scaled only vertically.
>

Postgres scales well vertically. Plus, you can have streaming replicas to
distribute the read queries (like the one given here) across many machines.
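
As a rough illustration of spreading reads across replicas, here is a small
routing sketch. The host names, the DSN strings, and the pick_dsn helper are
all hypothetical, and the "starts with SELECT" rule is a deliberate
simplification: a SELECT that calls a writing function or uses FOR UPDATE
would still need the primary, and replicas can lag slightly behind it.

```python
import itertools

# Hypothetical connection strings; substitute real hosts and credentials.
PRIMARY_DSN = "host=pg-primary.example dbname=shop"
REPLICA_DSNS = [
    "host=pg-replica-1.example dbname=shop",
    "host=pg-replica-2.example dbname=shop",
]

# Endless round-robin iterator over the replicas.
_replicas = itertools.cycle(REPLICA_DSNS)


def pick_dsn(sql):
    """Return the DSN a statement should be sent to.

    Statements whose first word is SELECT go round-robin to the replicas;
    everything else (including empty input) goes to the primary.
    """
    words = sql.split()
    if words and words[0].upper() == "SELECT":
        return next(_replicas)
    return PRIMARY_DSN
```

In practice this decision is often made by a connection pooler or the
application's data layer rather than by inspecting SQL text.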

> We have had huge headaches in the past with SQL Server on Windows and a
> single database.
> But when you divide bounded contexts into different databases, then you
> have the chance to deploy each database on a separate physical machine.
> That means a lot in terms of performance.
>

I get your concern, but if the data is inter-related, it really is best to
keep it on the same server (and in the same database and schema). Then
Postgres can devise a really efficient plan. You can also use Citus to
start sharding things across multiple physical servers if your database
gets very large.

> Let's put this physical restriction on ourselves that we have different
> databases. What options do we have?
>

Your main option is FDW (a foreign data wrapper such as postgres_fdw),
which will never perform as well as a single server. Plus, you have the
additional headache of trying to coordinate data updates atomically across
different servers. The other option is to have
the application do the work, e.g. pull a list of things from one server,
use that to build a query against another one, etc. Definitely not ideal.
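
The application-side approach described above can be sketched as follows.
This is only an illustration under stated assumptions: two in-memory SQLite
databases stand in for the two separate servers, and the products and
order_items tables, their columns, and the units_sold_by_category helper are
invented for the example.

```python
import sqlite3

# Two in-memory SQLite databases stand in for two separate database servers.
# All table and column names are hypothetical.
catalog = sqlite3.connect(":memory:")
orders = sqlite3.connect(":memory:")

catalog.execute(
    "CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT, category TEXT)"
)
catalog.executemany(
    "INSERT INTO products VALUES (?, ?, ?)",
    [(1, "kettle", "kitchen"), (2, "toaster", "kitchen"), (3, "drill", "tools")],
)

orders.execute(
    "CREATE TABLE order_items (order_id INTEGER, product_id INTEGER, qty INTEGER)"
)
orders.executemany(
    "INSERT INTO order_items VALUES (?, ?, ?)",
    [(100, 1, 2), (101, 3, 1), (102, 2, 5), (103, 1, 1)],
)


def units_sold_by_category(category):
    """Join across the two databases in application code."""
    # Step 1: pull the list of matching products from the first "server".
    rows = catalog.execute(
        "SELECT id, name FROM products WHERE category = ?", (category,)
    ).fetchall()
    names = dict(rows)  # product id -> product name
    if not names:
        return {}

    # Step 2: use that list to build a parameterized query against the other.
    placeholders = ", ".join("?" for _ in names)
    sql = (
        "SELECT product_id, SUM(qty) FROM order_items "
        f"WHERE product_id IN ({placeholders}) GROUP BY product_id"
    )
    totals = orders.execute(sql, list(names)).fetchall()

    # Step 3: stitch the two result sets together in memory.
    return {names[product_id]: total for product_id, total in totals}


result = units_sold_by_category("kitchen")
print(result)
```

Note what the sketch has to do by hand: two round trips, an id list that
grows with the data, and no transaction spanning both databases. A single
server would handle all of that inside one planned query.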

Cheers,
Greg

--
Crunchy Data - https://www.crunchydata.com
Enterprise Postgres Software Products & Tech Support
