public inbox for [email protected]  
help / color / mirror / Atom feed
From: David G. Johnston <[email protected]>
To: Lok P <[email protected]>
Cc: pgsql-general <[email protected]>
Subject: Re: Design strategy for table with many attributes
Date: Thu, 4 Jul 2024 22:36:03 -0700
Message-ID: <CAKFQuwZHHHF7A-5AhC-p-H4cquZSZq-JTGWg0Q+51Drfwc8VCQ@mail.gmail.com> (raw)
In-Reply-To: <CAKna9VZdihb=n9mNGaCZLKYbWAoS0+knfi_PAfLC5w=Mj5Egig@mail.gmail.com>
References: <CAKna9VbQ=o2Yhjo1EitaxzYpWAXe9vw6QtamzCsNNp2-rQOFSA@mail.gmail.com>
	<CAKFQuwYpai5snt5mEgyXK4m6nR5T6sP-YTYH-GQ0Hs2rXS7uLA@mail.gmail.com>
	<CAKna9VZdihb=n9mNGaCZLKYbWAoS0+knfi_PAfLC5w=Mj5Egig@mail.gmail.com>

On Thursday, July 4, 2024, Lok P <[email protected]> wrote:

>
> But do you also suggest keeping those table pieces related to each other
> through the same primary key ?
>
>
Yes, everyone row produced from the input data “row” should get the same ID
associated with it - either as an entire PK or a component of a
multi-column PK/unique index.

>
>
>  Won't there be a problem when we load the data like say for example , in
> normal scenario the data load will be to one table but when we break it to
> multiple tables it will happen to all the individual pieces, won't that
> cause additional burden to the data load?
>

Yes, doing this requires additional CPU time to perform the work.  I’d say
IO is hopefully a wash.


>
> Also I understand the technical limitation of the max number of
> columns per table is ~1600. But should you advise to restrict/stop us to
> some low number long before reaching that limit , such that we will not
> face any anomalies when we grow in future.
>

In a row-oriented system wider is worser.  I start evaluation of table
design with that in mind at the fourth column (including the surrogate key
that is usually present, and the natural key - ignoring auditing columns.)


>
>  And if we should maintain any specific order in the columns from start to
> end column in the specific table?
>

There is material out there on micro-optimizing column ordering to match
with alignment boundaries.  I’d the benefit is meaningful but there is a
non-trivial cost to actually setup the testing to verify that what you’ve
figured out is working.  Never actually done it myself.  Though it actually
seems like something someone could/may have written an algorithm for
(though I do not recall ever seeing mention of one.)

David J.


view thread (4+ messages)  latest in thread

reply

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Reply to all the recipients using the --to and --cc options:
  reply via email

  To: [email protected]
  Cc: [email protected], [email protected], [email protected]
  Subject: Re: Design strategy for table with many attributes
  In-Reply-To: <CAKFQuwZHHHF7A-5AhC-p-H4cquZSZq-JTGWg0Q+51Drfwc8VCQ@mail.gmail.com>

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox