Re: Quesion about querying distributed databases

public inbox for [email protected]  
help / color / mirror / Atom feed

Re: Quesion about querying distributed databases
2+ messages / 2 participants
[nested] [flat]

* Re: Quesion about querying distributed databases
@ 2025-03-06 07:44  me nefcanto <[email protected]>
  0 siblings, 1 reply; 2+ messages in thread

From: me nefcanto @ 2025-03-06 07:44 UTC (permalink / raw)
  To: Rob Sargent <[email protected]>; +Cc: Igor Korot <[email protected]>; Adrian Klaver <[email protected]>; Laurenz Albe <[email protected]>; pgsql-generallists.postgresql.org <[email protected]>

I appreciate your time guys. Thank you very much.

> Having 1 table per database per server is too ugly.

Our databases are not one table per database. They are mapped to DDD's
bounded contexts and usually by one table per domain entity.
For example, we have these databases:

   - Contacts
   - Courses
   - Seo
   - Payment
   - Forms
   - Geo
   - Sales
   - Media
   - Taxonomy
   - ...

These are the tables we have in the Contacts database:

   - Addresses
   - AddressTypes
   - Attributes
   - BankAccounts
   - ContactContents
   - Contacts
   - Emails
   - Genders
   - JobTitles
   - JuridicalPersons
   - NaturalPersonRelations
   - NaturalPersons
   - Persons
   - Phones
   - PhoneTypes
   - Relations
   - RelationTypes
   - SocialNetworks
   - SocialProfiles
   - Titles

And, these are the tables we have in the Geo database:

   - AdministrativeDivisions
   - AdministrativeDivisionTypes
   - Cities
   - CityDivisions
   - Countries
   - Locations
   - SpatialDataItems
   - TelephonePrefixes
   - TimeZones

But we also do have databases that only have one table in them. The number
of tables is not our criteria to break them. The business semantics is our
criteria.

> Cross-database on MSSQL is identical to the cross schema on Postgres.

Cross-database query in SQL Server is not equivalent to cross-schema
queries in Postgres. Because SQL Server also has the concept of schemas. In
other words, both SQL Server and Postgres let you create databases, create
schemas inside them, and create tables inside schemas. So SQL Server's
cross-schema query equals Postgres's cross-schema query.

> If you truly need cross server support (versus say beefier hardware) how
did you come to choose postgres?

We chose Postgres for these reasons that we did R&D about:

   - Native array per column support
   - Not having multiple storage engines like MariaDB to be confused about
   - Supporting expressions in unique constraints
   - It's usually considered one of the best when it comes to performance,
   especially in GIS we intend to develop more upon
   - As it claims on its website, it's the most advanced open-source
   database engine (but to be honest, we saw many serious drawbacks to that
   statement)

But here's the deal. We don't have one project only. We don't need
*cross-server
queries* for all of our projects. But we tend to keep our architecture the
same across projects as much as we can. We chose Postgres because we had
experience with SQL Server and MariaDB and assumed that cross-database
query on the same server is something natural. Both of them support that.
And both are very performant on that. On MariaDB all you have to do is to
use `db_name.table_name` and on SQL Server all you have to do is to use
`database_name.schema_name.table_name`. So we thought, for projects that do
not need more than one server, we keep databases on the same server. When
it needed more resources, we start by taking heavy databases onto their own
servers, and we start implementing table partitinong on them.

But we have experienced some amazing improvements too in our initial tests.
For example, creating all databases and tables and database objects on
MariaDB takes more than 400 seconds, while the same took 80 seconds on
Postgres. So amazing performance on DDL.
Also, 1 million records in bulk insertion take almost one-sixth to
on-fourth of the time on MariaDB. These are valuable numbers. They warmed
our hearts to keep digging as much as we can to see if we can perform this
migration.

Regards
Saeed

On Thu, Mar 6, 2025 at 7:14 AM Rob Sargent <[email protected]> wrote:

>
>
> On Mar 5, 2025, at 8:03 PM, Igor Korot jnit worked great for SQL Server.
> If you're small, we host them all on one server. If you get bigger, we can
> put heavy databases on separate machines.
>
>
>> However, I don't have experience working with other types of database
>> scaling. I have used table partitioning, but I have never used sharding.
>>
>> Anyway, that's why I asked you guys. However, encouraging me to go back
>> to monolith without giving solutions on how to scale, is not helping. To be
>> honest, I'm somehow disappointed by how the most advanced open source
>> database does not support cross-database querying just like how SQL Server
>> does. But if it doesn't, it doesn't. Our team should either drop it as a
>> choice or find a way (by asking the experts who built it or use it) how to
>> design based on its features. That's why I'm asking.
>>
>>
> Cross-database on MSSQL is identical to cross schema on postgres. If you
> truly need cross server support (versus say beefier hardware) how did you
> come to choose postgres?  The numbers you present are impressive but not
> unheard of on this list.
>
>

^ permalink  raw  reply  [nested|flat] 2+ messages in thread

* Re: Quesion about querying distributed databases
@ 2025-03-06 09:18  Igor Korot <[email protected]>
  parent: me nefcanto <[email protected]>
  0 siblings, 0 replies; 2+ messages in thread

From: Igor Korot @ 2025-03-06 09:18 UTC (permalink / raw)
  To: me nefcanto <[email protected]>; +Cc: Rob Sargent <[email protected]>; Adrian Klaver <[email protected]>; Laurenz Albe <[email protected]>; pgsql-generallists.postgresql.org <[email protected]>

Hi,

On Thu, Mar 6, 2025, 1:44 AM me nefcanto <[email protected]> wrote:

> I appreciate your time guys. Thank you very much.
>
> > Having 1 table per database per server is too ugly.
>
> Our databases are not one table per database. They are mapped to DDD's
> bounded contexts and usually by one table per domain entity.
> For example, we have these databases:
>
>
>    - Contacts
>    - Courses
>    - Seo
>    - Payment
>    - Forms
>    - Geo
>    - Sales
>    - Media
>    - Taxonomy
>    - ...
>
> These are the tables we have in the Contacts database:
>
>
>    - Addresses
>    - AddressTypes
>    - Attributes
>    - BankAccounts
>    - ContactContents
>    - Contacts
>    - Emails
>    - Genders
>    - JobTitles
>    - JuridicalPersons
>    - NaturalPersonRelations
>    - NaturalPersons
>    - Persons
>    - Phones
>    - PhoneTypes
>    - Relations
>    - RelationTypes
>    - SocialNetworks
>    - SocialProfiles
>    - Titles
>
> And, these are the tables we have in the Geo database:
>
>
>    - AdministrativeDivisions
>    - AdministrativeDivisionTypes
>    - Cities
>    - CityDivisions
>    - Countries
>    - Locations
>    - SpatialDataItems
>    - TelephonePrefixes
>    - TimeZones
>
> But we also do have databases that only have one table in them. The number
> of tables is not our criteria to break them. The business semantics is our
> criteria.
>
> > Cross-database on MSSQL is identical to the cross schema on Postgres.
>
> Cross-database query in SQL Server is not equivalent to cross-schema
> queries in Postgres. Because SQL Server also has the concept of schemas. In
> other words, both SQL Server and Postgres let you create databases, create
> schemas inside them, and create tables inside schemas. So SQL Server's
> cross-schema query equals Postgres's cross-schema query.
>
> > If you truly need cross server support (versus say beefier hardware)
> how did you come to choose postgres?
>
> We chose Postgres for these reasons that we did R&D about:
>
>
>    - Native array per column support
>    - Not having multiple storage engines like MariaDB to be confused about
>    - Supporting expressions in unique constraints
>    - It's usually considered one of the best when it comes to
>    performance, especially in GIS we intend to develop more upon
>    - As it claims on its website, it's the most advanced open-source
>    database engine (but to be honest, we saw many serious drawbacks to that
>    statement)
>
> But here's the deal. We don't have one project only. We don't need *cross-server
> queries* for all of our projects. But we tend to keep our architecture
> the same across projects as much as we can. We chose Postgres because we
> had experience with SQL Server and MariaDB and assumed that cross-database
> query on the same server is something natural. Both of them support that.
> And both are very performant on that. On MariaDB all you have to do is to
> use `db_name.table_name` and on SQL Server all you have to do is to use
> `database_name.schema_name.table_name`. So we thought, for projects that do
> not need more than one server, we keep databases on the same server. When
> it needed more resources, we start by taking heavy databases onto their own
> servers, and we start implementing table partitinong on them.
>

But why?
Remember - multiple servers means more traffic which might be performance
wise.
And especially if thise servers are located on different hardware as your
OP implied.

Thank you.


> But we have experienced some amazing improvements too in our initial
> tests. For example, creating all databases and tables and database objects
> on MariaDB takes more than 400 seconds, while the same took 80 seconds on
> Postgres. So amazing performance on DDL.
> Also, 1 million records in bulk insertion take almost one-sixth to
> on-fourth of the time on MariaDB. These are valuable numbers. They warmed
> our hearts to keep digging as much as we can to see if we can perform this
> migration.
>
> Regards
> Saeed
>
> On Thu, Mar 6, 2025 at 7:14 AM Rob Sargent <[email protected]> wrote:
>
>>
>>
>> On Mar 5, 2025, at 8:03 PM, Igor Korot jnit worked great for SQL Server.
>> If you're small, we host them all on one server. If you get bigger, we can
>> put heavy databases on separate machines.
>>
>>
>>> However, I don't have experience working with other types of database
>>> scaling. I have used table partitioning, but I have never used sharding.
>>>
>>> Anyway, that's why I asked you guys. However, encouraging me to go back
>>> to monolith without giving solutions on how to scale, is not helping. To be
>>> honest, I'm somehow disappointed by how the most advanced open source
>>> database does not support cross-database querying just like how SQL Server
>>> does. But if it doesn't, it doesn't. Our team should either drop it as a
>>> choice or find a way (by asking the experts who built it or use it) how to
>>> design based on its features. That's why I'm asking.
>>>
>>>
>> Cross-database on MSSQL is identical to cross schema on postgres. If you
>> truly need cross server support (versus say beefier hardware) how did you
>> come to choose postgres?  The numbers you present are impressive but not
>> unheard of on this list.
>>
>>


^ permalink  raw  reply  [nested|flat] 2+ messages in thread

end of thread, other threads:[~2025-03-06 09:18 UTC | newest]

Thread overview: 2+ messages (download: mbox mbox.gz follow: Atom feed)
-- links below jump to the message on this page --
2025-03-06 07:44 Re: Quesion about querying distributed databases me nefcanto <[email protected]>
2025-03-06 09:18 ` Igor Korot <[email protected]>

This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox