This discussion is archived
1 2 3 4 Previous Next 51 Replies Latest reply: May 11, 2013 11:02 AM by Debbie_1004338a Go to original post RSS
  • 45. Re: Is using Natural Keys bad?
    Brian Bontrager Expert
    Currently Being Moderated
    Can I have a "Derived Key" be my PK, and use a "Natural Key" as a secondary PK which I use to do my table joins?
    Yep.


    Edit: I'll expand and amend my "Yep", slightly. Say this out loud: "Secondary Primary Key". Doesn't sound right does it? Is it Primary?, or is it Secondary (another key that could also be used)?

    From a SQL semantics/technical syntax standpoint, having multiple unique keys is fine. (There are philosophical differences, as this thread has shown, and maybe performance differences). You can use any of them for joins, and the database will get the result set you need. ONE of the keys may be designated a Primary Key: Primary Keys have extra meaning to the database engine are may be treated differently -- that gets into how the query optimizer for a given database makes its decisions, and is off-topic of this thread.

    Your ideas will likely work either way. You may find one approach performs better than the other, depending on how much data you are working with. Set up both designs in your test environment, experiment, and get empirical evidence for each design decision. Now you have performance results that support one approach or the other, in your environment.

    Edited by: Brian Bontrager on May 10, 2013 4:05 PM
  • 46. Re: Is using Natural Keys bad?
    Debbie_1004338a Newbie
    Currently Being Moderated
    Brian Bontrager wrote:
    Can I have a "Derived Key" be my PK, and use a "Natural Key" as a secondary PK which I use to do my table joins?
    Yep.
    I just found out something really awesome from a friend...

    Apparently I did something wrong in the past, and he says that in MySQL it is possible to have an "id" field that uses AutoIncrement (i.e. "Derived Key"), and then make my "Natural Key" be my true Primary Key.

    (I think what I did wrong was you have to have a "Unique Index" on the "id" field with the AutoIncrement?!)


    Doing that would be better, because then I have my "Every record should have a Numeric ID to identify it", plus I can use a "Natural Key" - where it makes sense - to serve as a true PK for table joins. (And, of course, there will be times where I will want to have an "id" field be my PK too.)

    As I see it, the more options you have, the better off you are!!

    What does everyone else think about this new idea I had??

    Sincerely,


    Debbie
  • 47. Re: Is using Natural Keys bad?
    EdStevens Guru
    Currently Being Moderated
    Debbie_1004338a wrote:
    Brian Bontrager wrote:
    Can I have a "Derived Key" be my PK, and use a "Natural Key" as a secondary PK which I use to do my table joins?
    Yep.
    I just found out something really awesome from a friend...

    Apparently I did something wrong in the past, and he says that in MySQL it is possible to have an "id" field that uses AutoIncrement (i.e. "Derived Key"), and then make my "Natural Key" be my true Primary Key.

    (I think what I did wrong was you have to have a "Unique Index" on the "id" field with the AutoIncrement?!)


    Doing that would be better, because then I have my "Every record should have a Numeric ID to identify it", plus I can use a "Natural Key" - where it makes sense - to serve as a true PK for table joins. (And, of course, there will be times where I will want to have an "id" field be my PK too.)

    As I see it, the more options you have, the better off you are!!

    What does everyone else think about this new idea I had??

    Sincerely,


    Debbie
    I think I would tend to go the other direction. Make the 'autoincrement' (in oracle, that means populating the column with a sequence) the defined PK, then for your 'natural key' define a unique, not null index.
  • 48. Re: Is using Natural Keys bad?
    Debbie_1004338a Newbie
    Currently Being Moderated
    EdStevens wrote:

    I think I would tend to go the other direction. Make the 'autoincrement' (in oracle, that means populating the column with a sequence) the defined PK, then for your 'natural key' define a unique, not null index.
    Why that way?

    And if I did that, then could I confidently join tables using the (Unique) "Natural Key"?

    (Remember, in this scenario, my only use for the AutoIncrement is to have a handy Numeric ID for each record. My goal is to use the "Natural Key" for joins, and to be able to visually identify a record in junction tables.)

    For my particular Data Model, I would rather be able to use a record in my Junction Table if it read like this...
    Legal          Business Structure     Sole Proprietorship     WhenASoleProprietorGetsSued.php
    Because if it looked like this...
    2          1     1     37
    ...then I'd need to do a series of JOINS to get something human-readable.

    Sincerely,


    Debbie
  • 49. Re: Is using Natural Keys bad?
    Brian Bontrager Expert
    Currently Being Moderated
    If a human-readable database is most important, then use a natural key in the junction table.

    If minimizing storage is most important, then use something short, like a number, in the junction table.

    If performance is most important, try both, keeping in mind that number comparisons take fewer CPU cycles than string comparisons, and long natural keys may take more space in each data block, leading to large tables needing to read more blocks to get the same result.

    Edited by: Brian Bontrager on May 10, 2013 4:34 PM

    Edited by: Brian Bontrager on May 10, 2013 4:35 PM
  • 50. Re: Is using Natural Keys bad?
    Jonathan Lewis Oracle ACE Director
    Currently Being Moderated
    Debbie_1004338a wrote:

    I have an Article table which has a "slug" field and would have values like this...
    be-sure-to-charge-sales-tax.php
    
    when-sole-proprietorships-gets-sued.php
    
    making-sure-that-1099-contractors-withhold-their-own-taxes.php
    
    why-you-should-consider-becoming-an-s-corp.php
    Now, these would make decent PK's, however since this table will eventually hold tens of thousands of Articles, it would be a real PITA to have to always refer to a record by using just the "slug" (It would be much easier to have the option to say, "I want the record where the id=7129" versus "I want the record where slug = 'making-sure-that-1099-contractors-withhold-their-own-taxes.php'?!")
    Two points - it is possible to have multiple unique keys for an entity - usually called candidate keys - and by convention one of these is selected as the primary key. In Oracle any unique key can be the target for a referential integrity constraint, it doesn't have to be a primary key; from a pragmatic point of view there is a critical difference between primary and unique, though, which can result in confusion (and errors) in the way foreign keys are validated - a primary key is automatically NOT NULL, a unique key allows NULL unless you explicitly declare it NOT NULL.

    One of the good (mechanical) reasons for using natural keys is that the appearance of the natural value on the child table allows the user to query the child data without the need to reference the parent table to translate a meaningful value to a meaningless value with a join. If you have keys like "making-sure-that-1099-contractors-withhold-their-own-taxes.php" you're probably going to have to implement a mechanism that allows the user to select a key value from a list of approximate matches anyway (having supplied, e.g. "%1099%contractor%tax%" as a guess) - so an important benefit of the natural key is lost.

    Work out how your application has to use the primary key, and then decide on the strategy. If the natural key at the other end of the many-to-many is similar then I'd be inclined to have a surrogate key at both end and use that as the for both fields in the intersection table, but not introduce another surrogate key.

    Looking at your example, though, I'd take some convincing from the business that the slug was unique - I know that the "slugs" my blog supplier produces for my blog articles are unique - but I also know they're not unique across all the blogs the supplier supports. Might your system grow to support slugs from multiple publications ?

    Update: you might want to look at Oracle's text indexing option (I can never remember the latest name, but it was Context the last time I was using it) to handle searches for the right slug. (Not relevant to MySQL, though).

    Regards
    Jonathan Lewis

    Edited by: Jonathan Lewis on May 11, 2013 7:57 AM
  • 51. Re: Is using Natural Keys bad?
    Debbie_1004338a Newbie
    Currently Being Moderated
    Jonathan Lewis wrote:

    Two points - it is possible to have multiple unique keys for an entity - usually called candidate keys - and by convention one of these is selected as the primary key. In Oracle any unique key can be the target for a referential integrity constraint, it doesn't have to be a primary key;
    That's great to hear. (I'm not sure if it applies to MySQL, but as far as I can tell, it should.)

    from a pragmatic point of view there is a critical difference between primary and unique, though, which can result in confusion (and errors) in the way foreign keys are validated - a primary key is automatically NOT NULL, a unique key allows NULL unless you explicitly declare it NOT NULL.
    Fortunately, phpMyAdmin always defaults to NOT NULL, but point made.

    One of the good (mechanical) reasons for using natural keys is that the appearance of the natural value on the child table allows the user to query the child data without the need to reference the parent table to translate a meaningful value to a meaningless value with a join.
    Exactly.

    If you have keys like "making-sure-that-1099-contractors-withhold-their-own-taxes.php" you're probably going to have to implement a mechanism that allows the user to select a key value from a list of approximate matches anyway (having supplied, e.g. "%1099%contractor%tax%" as a guess) - so an important benefit of the natural key is lost.
    You lost me on this one...

    Work out how your application has to use the primary key, and then decide on the strategy. If the natural key at the other end of the many-to-many is similar then I'd be inclined to have a surrogate key at both end and use that as the for both fields in the intersection table, but not introduce another surrogate key.
    Again, you lost me...


    Short version of a longer story, is that I would have something similar to this...
    SECTION:
    - slug (PK)
    
    SUBSECTION:
    - slug (PK)
    
    ARTICLE
    - slug (PK)
    
    ARTICLE_PLACEMENT
    - section_slug (PK1)(FK)
    - subsection_slug (PK2)(FK)
    - article_slug (PK3)(FK)
    Where ARTICLE_PLACEMENT might look like...
    legal               s-corp               benefits-of-incorporating-your-small-business.php
    Looking at your example, though, I'd take some convincing from the business that the slug was unique - I know that the "slugs" my blog supplier produces for my blog articles are unique - but I also know they're not unique across all the blogs the supplier supports. Might your system grow to support slugs from multiple publications ?
    I am the "Business"! :)

    And, no, this won't be an issue, as long as I don't write an article like "tips-to-help-your-business.php"!! (My website and database will force the author - just me - to always hand-code a unique "Article Slug", and I'm naturally wordy, so I can't foresee "collisions" being an issue. And if there was one, it would never make it past my Article Submission Form...)


    Thanks,


    Debbie
1 2 3 4 Previous Next

Legend

  • Correct Answers - 10 points
  • Helpful Answers - 5 points