Re: Design strategy for table with many attributes


* Re: Design strategy for table with many attributes
@ 2024-07-05 07:53  Lok P <[email protected]>

From: Lok P @ 2024-07-05 07:53 UTC (permalink / raw)
  To: Guyren Howe <[email protected]>; pgsql-general <[email protected]>

Some folks on the team suggested keeping the key business attributes (that
is, the frequently queried ones) in individual columns and clubbing the
rest into a single JSON column in the same table. Is that advisable, or can
any issues occur with this approach? I'm also not sure how efficiently
Postgres processes JSON (for both reads and writes) compared to normal
columns in an OLTP environment. Please advise.

As David suggested, it breaks if a row exceeds the 8k limit, i.e. a single
page size. Will that still hold true if we have a column with JSON in it?

On Fri, 5 Jul, 2024, 12:04 pm Guyren Howe, <[email protected]> wrote:

> On Jul 4, 2024, at 23:28, Lok P <[email protected]> wrote:
>
>
>
> *"Note that you might want to split up the “parent” table if that
> naturally groups its columns together for certain uses. In that case, you
> could have the same pk on all the 1:1 tables you then have. In that case,
> the pk for each of those tables is also the fk."*
> Do you mean having a real FK created through DDL and maintaining it, or
> just assuming it, with no need to define it for all the pieces/tables? That
> is, only keep the same PK across all the pieces, since we know these are
> related to the same transaction and are logically related?
>
>
> A primary key *names something*. Often it’s a kind of platonic
> representation of a real thing — say, a person.
>
> I might use a person’s login columns in some functions, and the person’s
> name, birth date, etc. in other functions.
>
> Rather than have one table, I should split this into two, but use the same
> primary key (I would either name both id or both, say, person_id,
> irrespective of the name of the table, so it’s clear you’re doing this).
>
> You can just do a join on the mutual primary keys as you’d expect. In
> fact, if you name them the same, you can just use NATURAL JOIN.
>
> So you’d have person_details and person_login tables, and have a person_id
> pk for both.
>
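
A minimal sketch of the split described above, with the shared primary key also declared as a real foreign key. It uses Python's built-in sqlite3 module as a stand-in so it runs without a server; this is an assumption for illustration, not Postgres itself. The column names full_name, birth_date and username are hypothetical; only person_details, person_login and person_id come from the message.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for Postgres (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled

conn.executescript("""
CREATE TABLE person_details (
    person_id  INTEGER PRIMARY KEY,
    full_name  TEXT NOT NULL,      -- hypothetical column
    birth_date TEXT                -- hypothetical column
);
CREATE TABLE person_login (
    -- Same key name in both tables; here the PK is also the FK.
    person_id  INTEGER PRIMARY KEY REFERENCES person_details (person_id),
    username   TEXT NOT NULL UNIQUE  -- hypothetical column
);
""")

conn.execute("INSERT INTO person_details VALUES (1, 'Ada Example', '1990-01-01')")
conn.execute("INSERT INTO person_login VALUES (1, 'ada')")

# person_id is the only column name the two tables share,
# so NATURAL JOIN joins on it.
rows = conn.execute(
    "SELECT person_id, full_name, username "
    "FROM person_details NATURAL JOIN person_login"
).fetchall()


def login_without_person_is_rejected() -> bool:
    """Return True if the declared FK blocks a login row with no parent row."""
    try:
        conn.execute("INSERT INTO person_login VALUES (99, 'ghost')")
    except sqlite3.IntegrityError:
        return True
    return False
```

Declaring the constraint, rather than only assuming it, means the database itself refuses a login row whose person_id has no matching details row.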



* Re: Design strategy for table with many attributes
@ 2024-07-05 08:13  David Rowley <[email protected]>
  parent: Lok P <[email protected]>

From: David Rowley @ 2024-07-05 08:13 UTC (permalink / raw)
  To: Lok P <[email protected]>; +Cc: Guyren Howe <[email protected]>; pgsql-general <[email protected]>

On Fri, 5 Jul 2024 at 19:53, Lok P <[email protected]> wrote:
> As David suggested, it breaks if a row exceeds the 8k limit, i.e. a single page size. Will that still hold true if we have a column with JSON in it?

You wouldn't be at risk of the same tuple length problem if you
reduced the column count and stored the additional details in JSON.
Each varlena column is either stored in the tuple inline, or toasted
and stored out of line. Out of line values need an 18-byte pointer to
the toasted data. That pointer contributes to the tuple length.
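
As a rough illustration of how those pointers add up, here is a back-of-the-envelope sketch. The 18-byte figure is from the paragraph above; the 8192-byte page is the default block size and is an assumption about the installation. The calculation deliberately ignores the tuple header, null bitmap, alignment padding and page header, so the real limit is lower than what this reports.

```python
TOAST_POINTER_BYTES = 18  # per out-of-line value, as stated in the message
PAGE_BYTES = 8192         # assumption: default Postgres block size


def pointer_bytes(n_out_of_line_columns: int) -> int:
    """Bytes of tuple length used if every one of these columns is toasted."""
    return n_out_of_line_columns * TOAST_POINTER_BYTES


def fits_ignoring_overhead(n_out_of_line_columns: int) -> bool:
    """Optimistic check: only the pointers are counted, nothing else."""
    return pointer_bytes(n_out_of_line_columns) <= PAGE_BYTES
```

Even in the best case where every wide value is moved out of line, the pointers alone set a ceiling on the number of such columns, which is why reducing the column count avoids the problem.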

This isn't me advocating for JSON. I'm just explaining the
limitations. I think I'd only advocate for JSON if the properties you
need to store vary wildly between each tuple.  There's a large
overhead to storing JSON labels, which you'd pay the price of for each
tuple. That sounds like it would scale terribly with the data volumes
you've suggested you'll be processing.

David
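
To get a feel for the label overhead mentioned above, a small sketch that measures how many bytes of a compact JSON text are spent on the keys rather than the values. It measures the text form only; jsonb uses its own binary layout, but it still stores the keys in every row. The attribute names in the usage example are hypothetical.

```python
import json


def label_overhead_bytes(row: dict) -> int:
    """Bytes spent on keys (including their quotes and colons) in compact JSON."""
    compact = (",", ":")
    with_keys = json.dumps(row, separators=compact)
    # Same values serialized as an array, i.e. without any labels.
    values_only = json.dumps(list(row.values()), separators=compact)
    return len(with_keys.encode("utf-8")) - len(values_only.encode("utf-8"))


# Hypothetical attributes, for illustration only.
sample = {"merchant_category_code": 5411, "settlement_currency": "USD"}
overhead_per_row = label_overhead_bytes(sample)
```

Multiplying overhead_per_row by the expected row count gives a first estimate of the space that goes to labels alone, a cost that ordinary columns do not pay per row.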


* Re: Design strategy for table with many attributes
@ 2024-07-05 08:53  Lok P <[email protected]>
  parent: David Rowley <[email protected]>

From: Lok P @ 2024-07-05 08:53 UTC (permalink / raw)
  To: David Rowley <[email protected]>; +Cc: Guyren Howe <[email protected]>; pgsql-general <[email protected]>

On Fri, 5 Jul, 2024, 1:44 pm David Rowley, <[email protected]> wrote:

> On Fri, 5 Jul 2024 at 19:53, Lok P <[email protected]> wrote:
> > As David suggested, it breaks if a row exceeds the 8k limit, i.e. a single
> > page size. Will that still hold true if we have a column with JSON in it?
>
> You wouldn't be at risk of the same tuple length problem if you
> reduced the column count and stored the additional details in JSON.
> Each varlena column is either stored in the tuple inline, or toasted
> and stored out of line. Out of line values need an 18-byte pointer to
> the toasted data. That pointer contributes to the tuple length.
>
>
> David
>

Got it. Thank you very much.

So there would be performance overhead with JSON, and we need to validate
that carefully if we go in that direction at all.

However, out of curiosity: if the toasted/compressed JSON column itself
goes beyond 8k after compression, will it break then?


