This discussion is archived
1 2 3 Previous Next 38 Replies Latest reply: Sep 24, 2013 3:52 PM by jgarry RSS

Determining Keys for Junction Table

Debbie_1004338a Newbie
Currently Being Moderated

I would like to get some advice on how I currently define the keys for my Junction Tables.  (*NOTE:  I am doing this in MySQL, but my questions should apply to any database environment including Oracle.)

 

Based on some discussions and advice from this past summer, whenever I create a new table I try and follow this approach...

 

- Create an ID field

- Set the ID to AutoIncrement and make it a UK

- Look for some physical attribute which naturally describes the record and choose that as the PK (e.g. Username, Email, etc)

 

This allows me to always have a Unique Key for every record via the AutoIncrement.

 

It also helps to avoid this problem...

ID    User

---    -----

1    DoubleDee

2    DoubleDee

3    DoubleDee

4    DoubleDee

 

 

This approach seems to be working very well.

 

Now on to my specific question...

 

I have the following tables...

 

member

- id (UK)

- username (PK)

 

article

- id (UK)

- slug (PK)

 

article_thread

- id (UK)

- article_id (PK1)(FK)

- member_id (PK2)(FK)

- subscribed_on (PK3)

 

Is there any problem creating a Composite Primary Key on the 3 fields above?

 

 

A lot of people would just do this...

article_thread

- id (PK)

- article_id (FK)

- member_id (FK)

- subscribed_on

 

Similar to what I said in my introduction, I don't like the second approach, because it doesn't ensure uniqueness from a PHYSICAL standpoint.  (Maybe less true since I am using the Timestamp.)

 

In my mind, by having an ID that is a UK, I have an easy "hook" to refer to my Junction Table for other joins.  But I also have the benefit of ensuring that there is ONLY one Article + Member + Subscribed On combination in my Junction Table.

 

Am I putting undo stress on my database taking this approach?

 

MySQL seems fine with things in a test environment, but if I ever have 10,000 or 100,000 or one million records, then maybe I would regret this architectural decision... 

 

What do you gurus think?

 

Sincerely,

 

 

Debbie

"Trying to be cool like Oracle people!!"

  • 1. Re: Determining Keys for Junction Table
    JohnWatson Guru
    Currently Being Moderated

    Would this be a college homework assignment?

    I think you may not appreciate the difference between a surrogate key and a natural key, and what the purpose of each is. I think you are also using "unique key" and "primary key" incorrectly. Are you aware that  unique key can be NULL, unlike a primary key?

  • 2. Re: Determining Keys for Junction Table
    Debbie_1004338a Newbie
    Currently Being Moderated

    JohnWatson wrote:

     

    Would this be a college homework assignment?

    I think you may not appreciate the difference between a surrogate key and a natural key, and what the purpose of each is. I think you are also using "unique key" and "primary key" incorrectly. Are you aware that  unique key can be NULL, unlike a primary key?

     

    I spent 20 minutes composing that post and you liken it and me to a college student????  (Give me a little more credit...)  *rolls eyes*

     

    No, man, this is for a website that I am building in my free time.

     

    Maybe I don't know the true difference between a Unique Key and a Primary Key.  (Other than you can obviously only have ONE Primary Key!!)

     

    And, no, I did not know that a Unique Key can be NULL.

     

    Even if it can be NULL, I have member.id and article.id set as AutoIncrement and Null = No, so that would never happen.

     

     

    *****

    The approach I have been taking since talking to people on here this past summer is this...  (typically)

     

    1.) Create an ID field and make it AutoIncrement, Null = No.  Make that a Unique (Surrogate) Key.

    2.) Find a field which is a "natural" key and - in theory - will never change or be a duplicate.  Make that the Primary Key.

     

    In my opinion, Article Slug is a good PK because it is unique and it should never change.

     

    Member Username is maybe more debatable, and you could argue that making Member ID the PK is better, since you might have to change a Member Username (e.g. case of type-o during registration).  But that is a separate debate.

     

    I think my use of Primary and Surrogate Keys above is on track.  If you disagree, then please explain why.

     

    Sincerely,

     

     

    Debbie

     

    Message was edited by: Debbie_1004338a

  • 3. Re: Determining Keys for Junction Table
    jgarry Guru
    Currently Being Moderated

    It looks like you are trying to get the benefits of both natural and surrogate keys, and winding up losing both.  You have to worry about updating the primary key, and worry about the synchronization of the artificial key.

     

    You can find many long arguments about the benefits of each, so I won't go there.  For foreign keys in other tables with a one to many relationship, you ought to be storing the primary key because it is a primary key.

     

    In the end, there are differences between the physical implementations, that's why you do a proper logical normalization first, and make modifications for implementation specifics when implementing.

  • 4. Re: Determining Keys for Junction Table
    Debbie_1004338a Newbie
    Currently Being Moderated

    jgarry wrote:

     

    It looks like you are trying to get the benefits of both natural and surrogate keys, and winding up losing both.  You have to worry about updating the primary key, and worry about the synchronization of the artificial key.

     

    You can find many long arguments about the benefits of each, so I won't go there.  For foreign keys in other tables with a one to many relationship, you ought to be storing the primary key because it is a primary key.

     

    In the end, there are differences between the physical implementations, that's why you do a proper logical normalization first, and make modifications for implementation specifics when implementing.

     

    I'm not following you.

     

    There is no updating the Primary Key or having to synchronize the Artificial Key.  (That is why I use AutoIncrement for all of my ID fields.)

     

    So if I made all of my ID's be the Primary Key, would that satisfy your concerns?

     

    This entire topic can lead a person to going in an infinite loop, which is frustrating!

     

    Sincerely,

     

     

    Debbie

  • 5. Re: Determining Keys for Junction Table
    jgarry Guru
    Currently Being Moderated

    It doesn't really concern me.  If you choose wrong, you will learn something, possibly affecting your religious beliefs for the future.  But more likely, it's a tossup.  It's the maintenance down the road where small decisions have large consequences.

  • 6. Re: Determining Keys for Junction Table
    Debbie_1004338a Newbie
    Currently Being Moderated

    jgarry wrote:

     

    It doesn't really concern me.  If you choose wrong, you will learn something, possibly affecting your religious beliefs for the future.  But more likely, it's a tossup.  It's the maintenance down the road where small decisions have large consequences.

     

    So please help me choose a strategy that offers the most.

     

    Personally, I'm not a fan of just creating an "ID" field for every table, making it AutoIncrement (read "Sequence" - I believe - in Oracle-speak), as well as making it a Primary Key.

     

    Why?

     

    Because of this...

     

    ID    User

    ---    -----

    1    DoubleDee

    2    DoubleDee

    3    DoubleDee

    4    DoubleDee

     

     

    Or in the case of my Article Thread table, something like this...

    id    article_id    member_id    subscribed_on

    ---    -----------    ----------    --------------

    1    250        19        2013-09-19

    2    250        19        2013-09-19

    3    250        19        2013-09-19

    4    250        19        2013-09-19

     

     

    I guess I was under the impression that you should put an Index/Constraint - one in the same in MySQL - on the Foreign Key fields in a Junction Table for the reason shown above.

     

    (Feel free to jump in here, gurus...)

     

    If I flipped the paradigm around like this, would it be better in your mind...

    article_thread

    - id (PK)

    - article_id (UK1)(FK)

    - member_id (UK2)(FK)

    - subscribed_on (UK3)

     

     

    The more I think about all of this, the more confusing it becomes.  (One reason I figured I'd ask some Oracle Experts how they'd tackle things...)

     

    Sincerely,

     

     

    Debbie

  • 7. Re: Determining Keys for Junction Table
    Debbie_1004338a Newbie
    Currently Being Moderated

    Based on my research, if you set a field in MySQL to Null = No, then a Unique Key/Index cannot contain any NULL records, and thus is basically behaves like a Primary Key.

     

    As such, I can't find any reasons why the approach I described above could be considered "flawed".

     

    And having both a Surrogate Key and a Natural Key on tables provides several benefits.

     

    Also, based on my research, I cannot find any reasons why doing joins between tables using a Unique Key/Index and a Foreign Key would cause any undo harm.  (Sure, it is better to do joins between PK and FK, but sometimes because of how fields are defined, it is better for performance reasons to use the Surrogate UK versus the Natural PK.

     

    Not sure where all of my naysayers went...

     

    (It would be nice to get answers to my questions above, as well as some clarification on the strong comments posted by others...)

     

    Sincerely,

     

     

    Debbie

     

     

    Message was edited by: Debbie_1004338a

  • 8. Re: Determining Keys for Junction Table
    JohnWatson Guru
    Currently Being Moderated

    Your research will have taught you the meaning of the terms "surrogate key", "natural key", "primary key", and "unique key". Your design includes a surrogate key for each table,  which is pointless, because all your data integrity rules are based on natural keys.

  • 9. Re: Determining Keys for Junction Table
    Debbie_1004338a Newbie
    Currently Being Moderated

    JohnWatson wrote:

     

    Your research will have taught you the meaning of the terms "surrogate key", "natural key", "primary key", and "unique key". Your design includes a surrogate key for each table,  which is pointless, because all your data integrity rules are based on natural keys.

     

    Surrogate Keys are pointless?  (That should win you the ire of half the database world.)

     

    If you read my original post, you'll see that I am not using Surrogate Keys for "data integrity" in most cases...

     

     

    Debbie

  • 10. Re: Determining Keys for Junction Table
    rp0428 Guru
    Currently Being Moderated
    Surrogate Keys are pointless?  (That should win you the ire of half the database world.)

    When you think that one of the 'gurus' trying to help you has said something that would win them 'the ire of half the database world' your first thought should be that you misunderstood what they said. Your second thought should be to reread their entire reply and see if you can find a different interpretation of what they said.

     

    John did NOT say that surrogate keys are pointless. He said YOUR DESIGN is pointless because it includes a surrogate key for each table when your rules are based on natural keys.

    If you read my original post, you'll see that I am not using Surrogate Keys for "data integrity" in most cases...

    And another indication that you are either misreading or misinterpreting what John said because he said the SAME THING you just did:

    all your data integrity rules are based on natural keys.

    In short John and the others are saying if you are going to use natural keys then USE THEM and don't add superfluous surrogate keys which add NOTHING to the architecture but rather make it more complicated and harder to maintain.

    I have the following tables...

     

    member

    - id (UK)

    - username (PK)

     

    article

    - id (UK)

    - slug (PK)

     

    article_thread

    - id (UK)

    - article_id (PK1)(FK)

    - member_id (PK2)(FK)

    - subscribed_on (PK3)

    In each table above the PK is on natural keys. Therefore each id column is redundant and serves no useful purpose. That is the point everyone is trying to make in spite of your assertion to the contrary:

    And having both a Surrogate Key and a Natural Key on tables provides several benefits.

    And I'm not sure why you believe this statement that you made:

    There is no updating the Primary Key or having to synchronize the Artificial Key.

    Any meaningful piece of 'business data' is subject to change if for no other reason than it was entered incorrectly and needs to be updated to the 'correct' value.

     

    Any value that is used to join two tables BY DEFINITION has to be synchronized between the two tables. If the value in one of the joined tables changes the value in the other table must be changed SYNCHROUSLY to maintain proper data integrity.

     

    So when the 'values' used to join two tables are 'business data' and that 'business data' value has to be changed you have a problem: the contraint between the two tables gets in the way of performing the SYNCHROUS change of the value in both tables.

     

    Surrogate keys are BY DEFINITION not meaningful pieces of data. So when they are used as primary keys, foreign keys and join columns they do NOT manifest the SYNCHRONIZATION problem because you NEVER need to change their value since, BY DEFINITION, their actual value has NO meaning.

     

    And if you elect to use data values for primary keys, foreign keys and join columns then no purpose is served by adding surrogate keys.

  • 11. Re: Determining Keys for Junction Table
    Debbie_1004338a Newbie
    Currently Being Moderated

    rp0428 wrote:

     

    Surrogate Keys are pointless?  (That should win you the ire of half the database world.)

    When you think that one of the 'gurus' trying to help you has said something that would win them 'the ire of half the database world' your first thought should be that you misunderstood what they said. Your second thought should be to reread their entire reply and see if you can find a different interpretation of what they said.

     

    John did NOT say that surrogate keys are pointless. He said YOUR DESIGN is pointless...

     

     

    A lot of hostility in this thread from the first response.

     

    We went from accusations that I'm getting help on my homework to now being told my design is pointless.

     

    Why so extreme?

     

    I'll gladly break down my OP into more specific questions if that helps.  (Although I thought it was pretty to the point.)

     

    But if people think I'm an idiot, then why respond?

     

    Sincerely,

     

     

    Debbie

  • 12. Re: Determining Keys for Junction Table
    rp0428 Guru
    Currently Being Moderated

    A lot of hostility in this thread from the first response.

     

    We went from accusations that I'm getting help on my homework to now being told my design is pointless.

     

    Why so extreme?

    There you go again misinterpreting what people are saying. No one is being 'hostile' and no one is being 'extreme'. But we are ALL being as precise as we can be with our comments about your design.

     

    You have to read the ENTIRE response IN CONTEXT. You can't just select one word out of a response and make it your focus.

     

    I, of course, cannot speak for John but I have tried to tell you what I believe he meant.

    He said YOUR DESIGN is pointless because it includes a surrogate key for each table when your rules are based on natural keys.

    And this is what I said and is DEFINITELY what I meant:

    In each table above the PK is on natural keys. Therefore each id column is redundant and serves no useful purpose. That is the point everyone is trying to make in spite of your assertion to the contrary:

    You don't need both surrogate and natural keys. Use one or the other but not both.

     

    Because you are using both natural keys and surrogate keys the surrogate keys (the id columns) are redundant. That design (at least that element of the design, which is what is being discussed), as John said, is pointless (which means it serves no point) or, as I said is redundant and serves no useful purpose.

     

    This is what you ask for:

    I would like to get some advice on how I currently define the keys for my Junction Tables

    And that is EXACTLY what you got.

  • 13. Re: Determining Keys for Junction Table
    Debbie_1004338a Newbie
    Currently Being Moderated

    > There you go again misinterpreting what people are saying.

    > No one is being 'hostile' and no one is being 'extreme'.

    > But we are ALL being as precise as we can be with our comments about your design.

     

    It's the beauty of communication online.  What the sender says is not always what the receiver interprets...

     

     

    > In each table above the PK is on natural keys.

    > Therefore each id column is redundant and serves no useful purpose.

    > That is the point everyone is trying to make in spite of your assertion to the contrary:

    > You don't need both surrogate and natural keys. Use one or the other but not both.

     

    Except for these considerations...

     

    1.) I like to have a numeric ID for ever record.

     

    Put yourself in my non-Oracle, non-enterprise situation a second...

     

    Which table would you rather visually try and find a record in this table...

    some natural key

        -----------------

        x3hdj59fnhjs34

        s203nc49mcoc95

        #ndm495jnri9j4

        9qjcdch84894un

        !1fnisndj9dndn

     

     

     

     

    Or in this one with a Surrogate Key...

    id    some natural key

    ---    -----------------

    1    x3hdj59fnhjs34

    2    s203nc49mcoc95

    3    #ndm495jnri9j4

    4    9qjcdch84894un

    5    !1fnisndj9dndn

     

     

     

    2.) If I always have an ID field using AutoIncrement, Not Null, then that gives me a backup if I decide I want to join using a Surrogate Key versus a long clunky Natural Key.

     

    3.) Having an AutoIncrement Surrogate Key forces the records in my table to always sort in the order they were inserted.

     

    Without an ID, I could insert records like this...

    aaa

    zzz

    fff

    www

    iii

    bbb

     

    But MySQL will display them like this since they are sorted by the Primary Key...

    aaa

    bbb

    fff

    iii

    www

    zzz

     

    (Yeah, I would likely have a created_on field to recreate things, but its so much nicer to not have the database screw with how I physically inserted things.)

     

     

    4.) I might decide to make a field the Primary Key (e.g. "username"), and then with time realize that it changes too much and want to go with a Surrogate Key.

     

    If I create my ID, AutoIncrement, Not Null Surrogate Key up front, if I ever need/want to do such a switch, it is a piece of cake!!

     

     

    Based on my research and design, my Surrogate Keys should behave just like a Primary Key, they offer many benefits to me - as described above - and the overhead is next to nothing.

     

    So why not have them?

     

     

    Are the "redundant"?  I suppose they are.  (I never said differently.)

     

    But do they cause a problem with Data Integrity or Maintenance?  No.

     

     

    > Because you are using both natural keys and surrogate keys the surrogate keys (the id columns) are redundant.

    > That design (at least that element of the design, which is what is being discussed), as John said, is pointless

    > (which means it serves no point) or, as I said is redundant and serves no useful purpose.

     

    Yeah, you make it sound like I am killing puppies.

     

    Is it really that black-and-white to you?

     

    Don't you think I have some valid points above?

     

    BTW, it seems to me that based on some conversations in THIS FORUM this past summer, I came up with the above approach.  So some people out there must not think its so crazy?!

     

    And I know I have had some fairly experienced non-Oracle database people who seem to think my approach has some merits.

     

    I'm not looking to win any awards, but I do want something that is both *solid* as well as *flexible*, and I thought the design I described in my OP was the best of both worlds...

     

    Sincerely,

     

     

    Debbie

  • 14. Re: Determining Keys for Junction Table
    Hemant K Chitale Oracle ACE
    Currently Being Moderated

    One uses Surrogate Keys  (your "ID" keys) only when one cannot find a Natural Key in the data.  You do have natural keys (username, slug) for the first two tables.  Ergo, you don't need the surrogate keys for the first two tables.  But you likely need it for the article_thread table.

    Furthermore, any foreign keys should be referencing these primary key columns. That means that the FK definition for the article_thread table is wrong.

     

    Hemant K Chitale

1 2 3 Previous Next

Legend

  • Correct Answers - 10 points
  • Helpful Answers - 5 points