Re: Quesion about querying distributed databases

From: me nefcanto <[email protected]>
To: Laurenz Albe <[email protected]>
Cc: Adrian Klaver <[email protected]>
Cc: [email protected]
Subject: Re: Quesion about querying distributed databases
Date: Wed, 5 Mar 2025 12:57:37 +0330
Message-ID: <CAEHBEOBXzkGTqxQSYqmEFN5hbc=zsGWFpU9h8zf7AAPv4VdOWQ@mail.gmail.com> (raw)
In-Reply-To: <[email protected]>
References: <CAEHBEOBuoMFWuhHM3L_Zr6o1enELju-Vns6Pknt4TT+6MFQOwQ@mail.gmail.com>
	<[email protected]>
	<CAEHBEOD969YrbPH_z9OEmThWx3-w4sMMaHLhZLOQwqCwE8Y58Q@mail.gmail.com>
	<[email protected]>

Laurenz Albe, thanks for your answer.

Right now this data is in MariaDB, in separate databases (schemas) on a
single server. In that setup we simply use cross-database queries; this is
the status quo of our application.

Now our team has decided to migrate to Postgres. However, we realized that
Postgres does not support cross-database queries, and that to get them we
would have to use a foreign data wrapper (FDW). Since that means writing
more code anyway, we thought we might as well put the databases on separate
servers for scalability. That's the reason behind this question.

But we're stuck on performance. In SQL Server and MariaDB, cross-database
queries allow for neat separation of data while still delivering good
performance in the orchestration layer. You have separate databases, which
allows for fine-grained management (different backup schedules, index
recalculation, deployment, etc.), but at the same time you can write a
query in your app, or in an orchestrator database (let's call it All), that
is fast enough for millions of records.
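
To make the pattern concrete, here is a minimal sketch in Python that uses
SQLite's ATTACH DATABASE as a stand-in for a MariaDB or SQL Server
cross-database query. The file names, the products and item_categories
tables, and the category_id column are hypothetical and chosen only for
illustration. It shows the shape of such a query (one connection, one join
across two separately managed databases), not how it performs on a real
server.

```python
import os
import sqlite3
import tempfile


def build_demo(directory):
    """Create two separate database files; return one connection that sees both."""
    products_path = os.path.join(directory, "products.db")
    taxonomy_path = os.path.join(directory, "taxonomy.db")

    # Database 1: product data (hypothetical schema).
    conn = sqlite3.connect(products_path)
    conn.execute("CREATE TABLE products (id INTEGER PRIMARY KEY, title TEXT)")
    conn.executemany(
        "INSERT INTO products VALUES (?, ?)",
        [(1, "Laptop"), (2, "Phone"), (3, "Desk")],
    )
    conn.commit()
    conn.close()

    # Database 2: category assignments (hypothetical schema).
    conn = sqlite3.connect(taxonomy_path)
    conn.execute(
        "CREATE TABLE item_categories (product_id INTEGER, category_id INTEGER)"
    )
    conn.executemany(
        "INSERT INTO item_categories VALUES (?, ?)",
        [(1, 10), (2, 20), (3, 10)],
    )
    conn.commit()
    conn.close()

    # Open the first database and attach the second under the name "taxonomy".
    conn = sqlite3.connect(products_path)
    conn.execute("ATTACH DATABASE ? AS taxonomy", (taxonomy_path,))
    return conn


def products_in_category(conn, category_id):
    """Join tables that live in two different database files in one query."""
    return conn.execute(
        """
        SELECT p.id, p.title
        FROM main.products AS p
        JOIN taxonomy.item_categories AS ic ON ic.product_id = p.id
        WHERE ic.category_id = ?
        ORDER BY p.id
        """,
        (category_id,),
    ).fetchall()


def demo():
    """Build the two databases in a temporary directory and run the join."""
    with tempfile.TemporaryDirectory() as directory:
        conn = build_demo(directory)
        try:
            return products_in_category(conn, 10)
        finally:
            conn.close()


if __name__ == "__main__":
    print(demo())
```

Each file can still be backed up or maintained on its own, while the join
is planned and executed by a single engine. That is the property we are
trying to keep.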

However, we're stuck on this in Postgres. What solutions exist for this
problem?

Regards
Saeed

On Wed, Mar 5, 2025 at 11:09 AM Laurenz Albe <[email protected]> wrote:

> On Wed, 2025-03-05 at 10:12 +0330, me nefcanto wrote:
> > Adrian Klaver, thank you for the link. I asked the AI to create a query
> > for me using FDW.
> >
> > The problem here is that it collects all of the product_id values from
> > the ItemCategories table [...]
> >
> > That's not scalable. Is there a workaround for this?
>
> Without having scrutinized the case in detail: if your data are organized
> in an entity-attribute-value design and distributed across three databases,
> you cannot expect to end up with efficient queries.
>
> Perhaps you can extract the data and load them into a reasonably organized
> single database.  Such an ETL process might make the task much easier.
>
> Yours,
> Laurenz Albe
>
