Re: Quesion about querying distributed databases

public inbox for [email protected]  
help / color / mirror / Atom feed

From: Igor Korot <[email protected]>
To: me nefcanto <[email protected]>
Cc: Rob Sargent <[email protected]>
Cc: Adrian Klaver <[email protected]>
Cc: Laurenz Albe <[email protected]>
Cc: pgsql-generallists.postgresql.org <[email protected]>
Subject: Re: Quesion about querying distributed databases
Date: Thu, 6 Mar 2025 03:18:29 -0600
Message-ID: <CA+FnnTxvThOEwsStmFeXcCGKsSeX9Xd81D-rHz6OqWGuRBF=Zw@mail.gmail.com> (raw)
In-Reply-To: <CAEHBEODHVs2VLkT37iVJ4-QSAnk8x-GuK8Fmsxk=nP2+EycL5g@mail.gmail.com>
References: <CA+FnnTzjSDE9E=TF56F-EAp6u=oPH2vmGNrjN50H53dXrev1MA@mail.gmail.com>
	<[email protected]>
	<CAEHBEODHVs2VLkT37iVJ4-QSAnk8x-GuK8Fmsxk=nP2+EycL5g@mail.gmail.com>

Hi,

On Thu, Mar 6, 2025, 1:44 AM me nefcanto <[email protected]> wrote:

> I appreciate your time guys. Thank you very much.
>
> > Having 1 table per database per server is too ugly.
>
> Our databases are not one table per database. They are mapped to DDD's
> bounded contexts and usually by one table per domain entity.
> For example, we have these databases:
>
>
>    - Contacts
>    - Courses
>    - Seo
>    - Payment
>    - Forms
>    - Geo
>    - Sales
>    - Media
>    - Taxonomy
>    - ...
>
> These are the tables we have in the Contacts database:
>
>
>    - Addresses
>    - AddressTypes
>    - Attributes
>    - BankAccounts
>    - ContactContents
>    - Contacts
>    - Emails
>    - Genders
>    - JobTitles
>    - JuridicalPersons
>    - NaturalPersonRelations
>    - NaturalPersons
>    - Persons
>    - Phones
>    - PhoneTypes
>    - Relations
>    - RelationTypes
>    - SocialNetworks
>    - SocialProfiles
>    - Titles
>
> And, these are the tables we have in the Geo database:
>
>
>    - AdministrativeDivisions
>    - AdministrativeDivisionTypes
>    - Cities
>    - CityDivisions
>    - Countries
>    - Locations
>    - SpatialDataItems
>    - TelephonePrefixes
>    - TimeZones
>
> But we also do have databases that only have one table in them. The number
> of tables is not our criteria to break them. The business semantics is our
> criteria.
>
> > Cross-database on MSSQL is identical to the cross schema on Postgres.
>
> Cross-database query in SQL Server is not equivalent to cross-schema
> queries in Postgres. Because SQL Server also has the concept of schemas. In
> other words, both SQL Server and Postgres let you create databases, create
> schemas inside them, and create tables inside schemas. So SQL Server's
> cross-schema query equals Postgres's cross-schema query.
>
> > If you truly need cross server support (versus say beefier hardware)
> how did you come to choose postgres?
>
> We chose Postgres for these reasons that we did R&D about:
>
>
>    - Native array per column support
>    - Not having multiple storage engines like MariaDB to be confused about
>    - Supporting expressions in unique constraints
>    - It's usually considered one of the best when it comes to
>    performance, especially in GIS we intend to develop more upon
>    - As it claims on its website, it's the most advanced open-source
>    database engine (but to be honest, we saw many serious drawbacks to that
>    statement)
>
> But here's the deal. We don't have one project only. We don't need *cross-server
> queries* for all of our projects. But we tend to keep our architecture
> the same across projects as much as we can. We chose Postgres because we
> had experience with SQL Server and MariaDB and assumed that cross-database
> query on the same server is something natural. Both of them support that.
> And both are very performant on that. On MariaDB all you have to do is to
> use `db_name.table_name` and on SQL Server all you have to do is to use
> `database_name.schema_name.table_name`. So we thought, for projects that do
> not need more than one server, we keep databases on the same server. When
> it needed more resources, we start by taking heavy databases onto their own
> servers, and we start implementing table partitinong on them.
>

But why?
Remember - multiple servers means more traffic which might be performance
wise.
And especially if thise servers are located on different hardware as your
OP implied.

Thank you.


> But we have experienced some amazing improvements too in our initial
> tests. For example, creating all databases and tables and database objects
> on MariaDB takes more than 400 seconds, while the same took 80 seconds on
> Postgres. So amazing performance on DDL.
> Also, 1 million records in bulk insertion take almost one-sixth to
> on-fourth of the time on MariaDB. These are valuable numbers. They warmed
> our hearts to keep digging as much as we can to see if we can perform this
> migration.
>
> Regards
> Saeed
>
> On Thu, Mar 6, 2025 at 7:14 AM Rob Sargent <[email protected]> wrote:
>
>>
>>
>> On Mar 5, 2025, at 8:03 PM, Igor Korot jnit worked great for SQL Server.
>> If you're small, we host them all on one server. If you get bigger, we can
>> put heavy databases on separate machines.
>>
>>
>>> However, I don't have experience working with other types of database
>>> scaling. I have used table partitioning, but I have never used sharding.
>>>
>>> Anyway, that's why I asked you guys. However, encouraging me to go back
>>> to monolith without giving solutions on how to scale, is not helping. To be
>>> honest, I'm somehow disappointed by how the most advanced open source
>>> database does not support cross-database querying just like how SQL Server
>>> does. But if it doesn't, it doesn't. Our team should either drop it as a
>>> choice or find a way (by asking the experts who built it or use it) how to
>>> design based on its features. That's why I'm asking.
>>>
>>>
>> Cross-database on MSSQL is identical to cross schema on postgres. If you
>> truly need cross server support (versus say beefier hardware) how did you
>> come to choose postgres?  The numbers you present are impressive but not
>> unheard of on this list.
>>
>>

view thread (2+ messages)

reply

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Reply to all the recipients using the --to and --cc options:
  reply via email

  To: [email protected]
  Cc: [email protected], [email protected], [email protected], [email protected], [email protected], [email protected]
  Subject: Re: Quesion about querying distributed databases
  In-Reply-To: <CA+FnnTxvThOEwsStmFeXcCGKsSeX9Xd81D-rHz6OqWGuRBF=Zw@mail.gmail.com>

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox