public inbox for [email protected]  
help / color / mirror / Atom feed
Re: D308-E9AF-4C11 : CONFIRM from pgsql-sql (subscribe)
4+ messages / 4 participants
[nested] [flat]

* Re: D308-E9AF-4C11 : CONFIRM from pgsql-sql (subscribe)
@ 2001-07-27 18:02  Gonzo Rock <[email protected]>
  0 siblings, 2 replies; 4+ messages in thread

From: Gonzo Rock @ 2001-07-27 18:02 UTC (permalink / raw)
  To: pgsql-general

A Question for those of you who consider yourself crack Database Designers.

I am currently moving a large database(100+Tables) into pgSQL... with the intention of deploying against 'any' SQL database in the future. The development side will be rigorously using Standard SQL constructs with no unique/proprietary extensions.

My question concerns establishing the relationships.

Currently Relationships between tables are established via a Unique Integer ID like this:

*=APrimaryKey

 PartTypes    Customer     Parts
 ---------    --------     -----
 PartTypeID   CustomerID   PartID
*PartType    *Customer     PartTypeID
              Address      CustomerID
                          *PartNumber(2FieldPrimaryKey)
                          *PartRevision(2FieldPrimaryKey)
                           PartName
                      

HOWEVER; I have read lots of texts describing the Relational Design should be instead like this:

*=APrimaryKey

 PartTypes    Customer     Parts
 ---------    --------     -----
*PartType    *Customer     PartType  
              Address     *PartNumber(2FieldPrimaryKey)
                          *PartRevison(2FieldPrimaryKey)
                           PartName    
                           Customer
                                          
Both Techniques have a unique foreign key back to the parent tables but one uses No.Meaningful.Info.Integer.Data for the ForeignKey while the second uses Human.Understandable.ForeignKeys

Is one recommended over the other??? Sure appreciate the commentary before I get in too deep with all these tables.

Thanks!





^ permalink  raw  reply  [nested|flat] 4+ messages in thread

* Re: Re: D308-E9AF-4C11 : CONFIRM from pgsql-sql (subscribe)
@ 2001-07-27 18:24  [email protected]
  0 siblings, 0 replies; 4+ messages in thread

From: [email protected] @ 2001-07-27 18:24 UTC (permalink / raw)
  To: Gonzo Rock <[email protected]>; +Cc: pgsql-general



You should use No.Meaningful.Info.Integer.Data for the foreign keys.  If you use
the actual Customer Name for a primary key and foreign keys, and the customer
later changes their name or you have to correct a typo, you risk breaking the
links unless you have all your cascading triggers set up just right.  Use a
meaningless ID and your links stay the same even if the text changes.

Also, you'll probably use the ID's a lot as join criteria in your queries.
Joining on two integer fields should be faster than joining on two character
fields.  If you have any intersection tables for many-to-many relationships that
just store a pair of foreign keys, the intersection table will be much smaller
if it is just storing a pair of integers, as opposed to a pair of
varchar(100)'s.  Just multiply the difference per row by the number of rows
you're likely to have to estimate how much you'll save.  Yes, the varchar's
probably won't take up a whole 100 characters every row, but I imagine there's
also overhead to keep track of the length.




Gonzo Rock <GonzoRock%[email protected]> on 07/27/2001 02:02:15
PM

To:   pgsql-general%[email protected]
cc:    (bcc: Wesley Sheldahl/Lex/Lexmark)
Subject:  [GENERAL] Re: D308-E9AF-4C11 : CONFIRM from pgsql-sql (subscribe)



Both Techniques have a unique foreign key back to the parent tables but one uses
No.Meaningful.Info.Integer.Data for the ForeignKey while the second uses
Human.Understandable.ForeignKeys

Is one recommended over the other??? Sure appreciate the commentary before I get
in too deep with all these tables.

Thanks!



---------------------------(end of broadcast)---------------------------
TIP 1: subscribe and unsubscribe commands go to [email protected]








^ permalink  raw  reply  [nested|flat] 4+ messages in thread

* Re: Re: D308-E9AF-4C11 : CONFIRM from pgsql-sql (subscribe)
@ 2001-07-27 18:31  Mike Mascari <[email protected]>
  parent: Gonzo Rock <[email protected]>
  1 sibling, 0 replies; 4+ messages in thread

From: Mike Mascari @ 2001-07-27 18:31 UTC (permalink / raw)
  To: Gonzo Rock <[email protected]>; +Cc: pgsql-general

I prefer using unique integer ids generated from sequences rather than
keys composed of meaningful values.

Advantages:

Client side applications can store/handle the unique integer ids more
readily than having to deal with composite primary keys composed of
varying data types. For example, I can stuff the id associated with a
particular record easily in list boxes, combo boxes, edit controls, etc.
via SetItemData() or some other appropriate method. Its a bit more
complicated to track database records via composite keys of something
like: part no, vendor no, vendor group.

Updating the data doesn't require cascading updates. If you use keys
with meaning, the referential integrity constraints must support
cascading updates so if the key changes in the primary table the change
is cascaded to all referencing tables as well. Earlier versions of most
databases (Access, Oracle, etc.) only provided cascading deletes under
the assumption you would be using sequence generated keys.

Downside:

Many queries might require more joins against the primary table to fetch
the relevant information associated with the numerical id, whereas keys
composed of solely the values with which they are associated might not
require the joins, which will speed some applications. I now have some
queries with 20-way joins. But PostgreSQL provides a way to explicitly
set the path the planner will choose and so the execution of the query
is instantaneous. I'm not sure about other databases. In earlier
versions, I had to denormalize a bit solely for performance reasons.

In the past, I used to use composite keys and switched to the purely
sequence generated path and don't regret it at all. Of course, you'll
still have a unique constraint on the what-would-have-been meaningful
primary key.

Hope that helps,

Mike Mascari
[email protected]

Gonzo Rock wrote:
> 
> A Question for those of you who consider yourself crack Database Designers.
> 
> I am currently moving a large database(100+Tables) into pgSQL... with the intention of deploying against 'any' SQL database in the future. The development side will be rigorously using Standard SQL constructs with no unique/proprietary extensions.
> 
> My question concerns establishing the relationships.
> 
> Currently Relationships between tables are established via a Unique Integer ID like this:
> 
> *=APrimaryKey
> 
>  PartTypes    Customer     Parts
>  ---------    --------     -----
>  PartTypeID   CustomerID   PartID
> *PartType    *Customer     PartTypeID
>               Address      CustomerID
>                           *PartNumber(2FieldPrimaryKey)
>                           *PartRevision(2FieldPrimaryKey)
>                            PartName
> 
> 
> HOWEVER; I have read lots of texts describing the Relational Design should be instead like this:
> 
> *=APrimaryKey
> 
>  PartTypes    Customer     Parts
>  ---------    --------     -----
> *PartType    *Customer     PartType
>               Address     *PartNumber(2FieldPrimaryKey)
>                           *PartRevison(2FieldPrimaryKey)
>                            PartName
>                            Customer
> 
> Both Techniques have a unique foreign key back to the parent tables but one uses No.Meaningful.Info.Integer.Data for the ForeignKey while the second uses Human.Understandable.ForeignKeys
> 
> Is one recommended over the other??? Sure appreciate the commentary before I get in too deep with all these tables.
> 
> Thanks!



^ permalink  raw  reply  [nested|flat] 4+ messages in thread

* Re: Re: D308-E9AF-4C11 : CONFIRM from pgsql-sql (subscribe)
@ 2001-07-27 20:28  Oliver Elphick <[email protected]>
  parent: Gonzo Rock <[email protected]>
  1 sibling, 0 replies; 4+ messages in thread

From: Oliver Elphick @ 2001-07-27 20:28 UTC (permalink / raw)
  To: Gonzo Rock <[email protected]>; +Cc: pgsql-general

Gonzo Rock wrote:
  >Is one recommended over the other??? Sure appreciate the commentary before I
      > get in too deep with all these tables.
 
The second sounds OK, but only if the chosen field is truly a candidate key.
"Customer" does not sound like one - suppose you have two 'John Smith's?
This is why most real-world applications use unique numbers or codes.
Of course you could (probably) differentiate the 'John Smith's by address,
but then the address has to be typed in as well as the name.  A code is
much easier.

It all depends on the nature of the data.

-- 
Oliver Elphick                                [email protected]
Isle of Wight                              http://www.lfix.co.uk/oliver
PGP: 1024R/32B8FAA1: 97 EA 1D 47 72 3F 28 47  6B 7E 39 CC 56 E4 C1 47
GPG: 1024D/3E1D0C1C: CA12 09E0 E8D5 8870 5839  932A 614D 4C34 3E1D 0C1C
                 ========================================
     "But the wisdom that is from above is first pure, then 
      peaceable, gentle, and easy to be intreated, full of 
      mercy and good fruits, without partiality, and without
      hypocrisy."     James 3:17 






^ permalink  raw  reply  [nested|flat] 4+ messages in thread


end of thread, other threads:[~2001-07-27 20:28 UTC | newest]

Thread overview: 4+ messages (download: mbox mbox.gz follow: Atom feed)
-- links below jump to the message on this page --
2001-07-27 18:02 Re: D308-E9AF-4C11 : CONFIRM from pgsql-sql (subscribe) Gonzo Rock <[email protected]>
2001-07-27 18:31 ` Mike Mascari <[email protected]>
2001-07-27 20:28 ` Oliver Elphick <[email protected]>
2001-07-27 18:24 Re: Re: D308-E9AF-4C11 : CONFIRM from pgsql-sql (subscribe) [email protected]

This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox