MIME-Version: 1.0
References: <CA+FnnTzjSDE9E=TF56F-EAp6u=oPH2vmGNrjN50H53dXrev1MA@mail.gmail.com>
 <025C57B4-F1C3-4533-86FD-D7C85EDCF143@gmail.com>
In-Reply-To: <025C57B4-F1C3-4533-86FD-D7C85EDCF143@gmail.com>
From: me nefcanto <sn.1361@gmail.com>
Date: Thu, 6 Mar 2025 11:14:25 +0330
Message-ID: <CAEHBEODHVs2VLkT37iVJ4-QSAnk8x-GuK8Fmsxk=nP2+EycL5g@mail.gmail.com>
Subject: Re: Quesion about querying distributed databases
To: Rob Sargent <robjsargent@gmail.com>
Cc: Igor Korot <ikorot01@gmail.com>, Adrian Klaver <adrian.klaver@aklaver.com>, 
	Laurenz Albe <laurenz.albe@cybertec.at>, 
	"pgsql-generallists.postgresql.org" <pgsql-general@lists.postgresql.org>
Content-Type: multipart/alternative; boundary="00000000000095348a062fa7adae"
Archived-At: <https://www.postgresql.org/message-id/CAEHBEODHVs2VLkT37iVJ4-QSAnk8x-GuK8Fmsxk%3DnP2%2BEycL5g%40mail.gmail.com>
Precedence: bulk

--00000000000095348a062fa7adae
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable

I appreciate your time, guys. Thank you very much.

> Having 1 table per database per server is too ugly.

Our databases are not one table per database. They map to DDD bounded
contexts, usually with one table per domain entity.
For example, we have these databases:


   - Contacts
   - Courses
   - Seo
   - Payment
   - Forms
   - Geo
   - Sales
   - Media
   - Taxonomy
   - ...

These are the tables we have in the Contacts database:


   - Addresses
   - AddressTypes
   - Attributes
   - BankAccounts
   - ContactContents
   - Contacts
   - Emails
   - Genders
   - JobTitles
   - JuridicalPersons
   - NaturalPersonRelations
   - NaturalPersons
   - Persons
   - Phones
   - PhoneTypes
   - Relations
   - RelationTypes
   - SocialNetworks
   - SocialProfiles
   - Titles

And, these are the tables we have in the Geo database:


   - AdministrativeDivisions
   - AdministrativeDivisionTypes
   - Cities
   - CityDivisions
   - Countries
   - Locations
   - SpatialDataItems
   - TelephonePrefixes
   - TimeZones

But we do also have databases that contain only one table. The number of
tables is not our criterion for splitting them. Business semantics is our
criterion.

> Cross-database on MSSQL is identical to the cross schema on Postgres.

A cross-database query in SQL Server is not equivalent to a cross-schema
query in Postgres, because SQL Server also has the concept of schemas. In
other words, both SQL Server and Postgres let you create databases, create
schemas inside them, and create tables inside schemas. So it is SQL Server's
cross-schema query that equals Postgres's cross-schema query.
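
To make the distinction concrete, here is a minimal sketch in Python. The
names `TableRef` and `query_scope` are made up for illustration and are not
part of any database API; the sketch only models the database / schema /
table hierarchy that both engines share.

```python
from typing import NamedTuple


class TableRef(NamedTuple):
    """Hypothetical three-level name: database, then schema, then table."""
    database: str
    schema: str
    table: str


def query_scope(a: TableRef, b: TableRef) -> str:
    """Classify a query that touches both tables by the level at which they differ."""
    if a.database != b.database:
        return "cross-database"
    if a.schema != b.schema:
        return "cross-schema"
    return "same-schema"


# "dbo" is SQL Server's default schema; the table names come from the lists above.
addresses = TableRef("Contacts", "dbo", "Addresses")
cities = TableRef("Geo", "dbo", "Cities")
print(query_scope(addresses, cities))
```

Two tables in different databases differ at the top level, which is a
different case from two tables that only differ at the schema level.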

> If you truly need cross server support (versus say beefier hardware) how
did you come to choose postgres?

We chose Postgres for these reasons, which came out of our R&D:


   - Native support for array columns
   - No multiple storage engines to get confused about, unlike MariaDB
   - Support for expressions in unique constraints
   - It's usually considered one of the best when it comes to performance,
   especially for GIS, which we intend to build on further
   - As it claims on its website, it's the most advanced open-source
   database engine (but to be honest, we saw many serious drawbacks to that
   statement)

But here's the deal. We don't have only one project, and we don't need
*cross-server queries* for all of our projects. But we try to keep our
architecture the same across projects as much as we can. We chose Postgres
because we had experience with SQL Server and MariaDB and assumed that
cross-database queries on the same server were something natural. Both of
them support that, and both perform very well at it. On MariaDB all you
have to do is use `db_name.table_name`, and on SQL Server all you have to
do is use `database_name.schema_name.table_name`. So we thought that for
projects that do not need more than one server, we would keep the databases
on the same server. When a project needed more resources, we would start by
moving heavy databases onto their own servers and implementing table
partitioning on them.
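
As a rough illustration of the `db_name.table_name` style of cross-database
query, here is a small runnable sketch in Python. Note the assumptions: it
uses SQLite's ATTACH DATABASE purely as a stand-in, not MariaDB, SQL Server,
or Postgres, and the columns and sample row are invented for the example.
Only the database and table names come from the lists above.

```python
import sqlite3


def cross_database_join():
    """Join Addresses (contacts database) to Cities (geo database) on one connection."""
    conn = sqlite3.connect(":memory:")
    # Each ATTACH creates a separate in-memory database with its own name.
    conn.execute("ATTACH DATABASE ':memory:' AS contacts")
    conn.execute("ATTACH DATABASE ':memory:' AS geo")
    # Hypothetical columns; the real tables are not described in this thread.
    conn.execute("CREATE TABLE geo.Cities (Id INTEGER PRIMARY KEY, Name TEXT)")
    conn.execute(
        "CREATE TABLE contacts.Addresses "
        "(Id INTEGER PRIMARY KEY, CityId INTEGER, Line TEXT)"
    )
    conn.execute("INSERT INTO geo.Cities VALUES (1, 'Tehran')")
    conn.execute("INSERT INTO contacts.Addresses VALUES (10, 1, 'Example street 1')")
    # The join qualifies each table with its database name.
    rows = conn.execute(
        "SELECT a.Line, c.Name "
        "FROM contacts.Addresses AS a "
        "JOIN geo.Cities AS c ON c.Id = a.CityId"
    ).fetchall()
    conn.close()
    return rows


print(cross_database_join())
```

This is the kind of query we assumed would work the same way everywhere when
all the databases live on one server.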

But we have also seen some amazing improvements in our initial tests.
For example, creating all databases, tables, and database objects on
MariaDB takes more than 400 seconds, while the same takes 80 seconds on
Postgres. So, amazing performance on DDL.
Also, bulk-inserting 1 million records on Postgres takes roughly one-sixth
to one-fourth of the time it takes on MariaDB. These are valuable numbers.
They warmed our hearts to keep digging as much as we can to see if we can
perform this migration.

Regards
Saeed

On Thu, Mar 6, 2025 at 7:14 AM Rob Sargent <robjsargent@gmail.com> wrote:

>
>
> On Mar 5, 2025, at 8:03 PM, Igor Korot jnit worked great for SQL
> Server. If you're small, we host them all on one server. If you get
> bigger, we can put heavy databases on separate machines.
>
>
>> However, I don't have experience working with other types of database
>> scaling. I have used table partitioning, but I have never used sharding.
>>
>> Anyway, that's why I asked you guys. However, encouraging me to go back
>> to monolith without giving solutions on how to scale, is not helping.
>> To be honest, I'm somehow disappointed by how the most advanced open
>> source database does not support cross-database querying just like how
>> SQL Server does. But if it doesn't, it doesn't. Our team should either
>> drop it as a choice or find a way (by asking the experts who built it
>> or use it) how to design based on its features. That's why I'm asking.
>>
>>
> Cross-database on MSSQL is identical to cross schema on postgres. If you
> truly need cross server support (versus say beefier hardware) how did you
> come to choose postgres?  The numbers you present are impressive but not
> unheard of on this list.
>
>

--00000000000095348a062fa7adae--