1 2 3 4 Previous Next 51 Replies Latest reply: May 11, 2013 1:02 PM by Debbie_1004338a RSS

    Is using Natural Keys bad?

    Debbie_1004338a
      Hi there. First time poster, and fairly new to the database world.

      Up front, I would like to say that I am working with MySQL, but since I consider Oracle people to be some of the smartest, I figured I'd join and try to learn from the pros!!

      I would like to know if it is "bad" to use Natural/Physical Keys as the Primary Key in a table?

      How does this affect performance?

      I have some tables which are basically "look-up" tables, and it would be much easier for me to work with Natural Keys, but I don't want to choose something that is easier for me, that would mess up my database.

      Sincerely,


      Debbie
        • 1. Re: Is using Natural Keys bad?
          994218
          Could you please redefine your problem ?

          Thanks,
          http://www.dbas-oracle.com/
          • 2. Re: Is using Natural Keys bad?
            Brian Bontrager
            Welcome to the forum!

            It depends somewhat on your data model, with a dash of personal preference. If your natural key will not change and is fairly compact, it can be every bit as good as a derived primary key (such as a number from a sequence).

            In some cases performance can be improved by using the natural key as the primary. You would not need an "extra" index on the natural key in that case... fewer indexes can help insert/update performance.
            • 3. Re: Is using Natural Keys bad?
              Iordan Iotzov
              Welcome to the forums!

              The answer, as usual, is “it depends”.

              I think there are two major aspects of the surrogate key/natural key debate – “flexibility” and performance.

              A “flexible” database design is one that can relatively easily accommodate changes and new requirements. Since changing a primary key is typically a difficult thing to do, using surrogate keys for a primary key give “flexibility” to a system. The impact of a natural key change on the DB design would be significantly smaller is surrogate keys are used.

              Generally speaking, using surrogate keys would require more disk space and it would take some, usually small, toll on performance. The performance impact can vary though. If the tables are implemented as Oracle Index-Organized tables or as MySQL tables (not 100% sure for that), the primary key is stored in all indexes. If the surrogate key is smaller than the natural one, then the surrogate key can reduce storage footprint and even improve performance for some queries.

              Iordan Iotzov
              • 4. Re: Is using Natural Keys bad?
                Debbie_1004338a
                Brian Bontrager wrote:
                Welcome to the forum!
                Thanks!
                It depends somewhat on your data model, with a dash of personal preference. If your natural key will not change and is fairly compact, it can be every bit as good as a derived primary key (such as a number from a sequence).

                In some cases performance can be improved by using the natural key as the primary. You would not need an "extra" index on the natural key in that case... fewer indexes can help insert/update performance.
                Well, maybe I can share more of what I am working on? (BTW, I have lots of other questions, but I'm trying to ask for help with little "bits" of the bigger problem.)

                I have a "Section" and a "Dimension" table which form a many-to-many relationship.

                If I wanted to make the junction table user and query friendly, I would make it look like this...

                SECTION_DIMENSION table

                Section          Dimension
                Finance          Business Structure
                Finance          Store Type
                Finance          Offering
                Finance          Accounting
                Legal          Business Structure
                Legal          Store Type
                Legal          Offering
                Legal          Featured Legal

                If I use Natural Keys like above, then I could use the SECTION_DIMENSION directly and not have to create a complex join.

                It would also be much easier for other issues I need help with. Specifically, it would allow me to "read" one table and see how different "Entities" relate...

                Does that help answer what I am working on?

                Sincerely,


                Debbie

                P.S. BTW, isn't there anything here in the forums to preserve FORMATTING and TABS?? :(
                • 5. Re: Is using Natural Keys bad?
                  rp0428
                  Welcome to the forum!
                  >
                  I would like to know if it is "bad" to use Natural/Physical Keys as the Primary Key in a table?

                  How does this affect performance?
                  >
                  Performance isn't an issue. The issues are mostly related to maintenance of the data, although security can be an issue also.

                  The purpose of a primary key value is to ensure uniqueness and, often, to provide referential integrity between parent and child tables. That means there is typically a FOREIGN KEY relationship between parent and child using the primary key.

                  The maintenance problems arise when a key value needs to be changed. That means the value has to be changed in ALL tables, parent and child, that are affected. Except you can't change the parent key value if child rows exist ; those child rows MUST point to a parent.

                  If the value used for the key is an actual 'data' value (i.e. a natural key such as employee_id or social security number) a user could make a mistake entering the data. That mistake means that the value MUST be changed..

                  To change a primary key value you must:

                  1. first create a new parent row with the new value
                  2. change the child rows to use the new value
                  3. drop the original parent row

                  Or you need to disable the foreign key constraint, change both tables and then reenable the constraint.

                  Either of those options can be difficult and cause problems in a multi-user database.

                  And if you ever restore data from a backup you may restore the OLD data values. Or you may not be able to restore just the parent data if it has the old key value.

                  A security issue arises for key values based on a natural key such as employee_id or social security number. Now you are duplicating that sensitive information in multiple tables and need to protect it.

                  A surrogate key, such as a sequence number, has NO business meaning. It isn't entered by the user, doesn't matter to the user what the value is and never needs to be changed. Also, it isn't a real data item so there are no real security issues.
                  >
                  I have some tables which are basically "look-up" tables, and it would be much easier for me to work with Natural Keys, but I don't want to choose something that is easier for me, that would mess up my database.
                  >
                  Some organizations use a more natural key for those "look-up" tables. If the data value is unlikely to change you won't have the issues mentioned above.

                  In the U.S. a lookup table for 'STATES' will often use the two letter state abbreviation as the primary key. The number of states is 50 and is not likely to change. The abbreviation is not likely to change either. So using the abbreviation is faily safe.

                  From a pure architectural viewpoint a surrogate key would still be the preferred choice.

                  You have to make the 'right' decision for your organization. All architecture and implementation decisions should be documented along with the reasons they were made. That means listing the pros and cons (the possible risks) and then the reason for the choice that was made.
                  • 6. Re: Is using Natural Keys bad?
                    Marcus Rangel
                    Hello,

                    I have worked with both approaches and I believe surrogate keys present more advantages. Here is my pros and cons list. It was written in portuguese a while ago and translated in 5 minutes just now, so please overlook eventual mistakes... :)

                    Pros:
                    - Surrogate keys allow business changes with less impact. This is probably the most important argument. Natural keys are always subject to change. Companies merge, the government adds another digit, the law changes. Always be suspicious of the analyst who says "this field will never change." I've had to change this information (policy number) that was the PK of a 600,000 records table, with more than 100 "descendants" distributed in 11 levels of dependency. Believe me, it was no fun.
                    - Surrogate keys result in fewer columns in the tables. The choice of natural keys is always accompanied by the use of composite keys, which leads to information redundancy and waste of space.
                    - Updates do not occur in the fields of the primary key, so there is less chance of problematic locks. Incidentally, updates on the primary key fields violate the relational model common sense and should not happen anyway.
                    - Because the keys are composed of just one field, usually a number, the joins are simpler (for the developer and for the optimizer) and faster.
                    - The use of single-column keys facilitates the calculations for the cost-based optimizer (which by the way can only accurately handle multiple columns cardinality in Oracle 11g release 2 - extended statistics). This results in better decisions, better execution plans and thus better performance of the system as a whole.

                    Cons:
                    - More join operations are needed to assemble the information to the end user. This can be solved in various ways, being "snapshots" or "materialized views" the most popular option today.
                    - One more constraint is needed ("unique" in the fields of the natural key).
                    - Part of business integrity ceases to be validated by FKs. This is the highest point of attention, because the database will not detect anomalies such as "order from one customer containing items from another customer". In practice, it is rare for stable applications to create such anomalies, but load processes and migrations can easily do it.

                    Edited by: Marcus Rangel on May 7, 2013 2:15 PM
                    • 7. Re: Is using Natural Keys bad?
                      Debbie_1004338a
                      Iordan Iotzov wrote:
                      Welcome to the forums!
                      Thank you!

                      The answer, as usual, is “it depends”.
                      I figured!

                      I think there are two major aspects of the surrogate key/natural key debate – “flexibility” and performance.

                      A “flexible” database design is one that can relatively easily accommodate changes and new requirements. Since changing a primary key is typically a difficult thing to do, using surrogate keys for a primary key give “flexibility” to a system. The impact of a natural key change on the DB design would be significantly smaller is surrogate keys are used.

                      Generally speaking, using surrogate keys would require more disk space and it would take some, usually small, toll on performance. The performance impact can vary though. If the tables are implemented as Oracle Index-Organized tables or as MySQL tables (not 100% sure for that), the primary key is stored in all indexes. If the surrogate key is smaller than the natural one, then the surrogate key can reduce storage footprint and even improve performance for some queries.
                      Well, not to get too far ahead, but let me share even more than what I just did responding to Brian...

                      I am familiar with what you are alluding to, and so to "have my cake and eat it too", I started off with this table approach...

                      Key to Debbie's Lingo
                      PK = Primary Key
                      UK = Unique Key
                      FK = Foreign Key
                      UK1 = Partial Unique Key
                      IDX = Index


                      **NOTE:
                      1.) MySQL requires that you have Indexes on any fields that act as Foreign Keys.
                      2.) I believe a "Primary Key" and "Foreign Key" are technically "Constraints", but in MySQL they are called "Indexes" and really a "Constraint" + "Index" in one!!



                      SECTION table
                      - id (PK)
                      - slug (UK)
                      - name (UK)
                      - created_on
                      - updated_on


                      DIMENSION table
                      - id (PK)
                      - slug (UK)
                      - name (UK)
                      - created_on
                      - updated_on


                      SECTION_DIMENSION junction
                      - id (PK?)
                      - section_slug (UK1)(IDX)
                      - dimension_slug (UK2)(IDX)
                      - created_on
                      - updated_on


                      So what I am trying to do above is "have it both ways". On one hand, I am creating your standard AutoNumber, Integer, Primary Key on every Table.

                      But I also have a "Natural Key" (i.e. Slug) using a Unique Index which is understandable in English, and which makes my Junction Table much easier to read!!

                      Some people would say that in SECTION_DIMENSION I could drop the "id" and just make "section_slug" + "dimension_slug" my PK. (I always debate over this, since I was taught that all Tables must have an AutoIndex PK...)

                      Lots of information, but hopefully that helps you guys to better understand what I am confused about, and what I am trying to do?!

                      Sincerely,


                      Debbie
                      • 8. Re: Is using Natural Keys bad?
                        EdStevens
                        991215 wrote:
                        Could you please redefine your problem ?

                        Thanks,
                        http://www.dbas-oracle.com/
                        You are walking a very fine line here. A lot of us have blogs that we use as a quick reference for what are really FAQ questions. When I link to mine, it is readily apparent that I am referencing my own work, and it is strictly non-commercial. In your case, the site is festooned with commercial content, and your posting history is devoid of any real content beyond the link to your decidedly commercial blog. Not to mention that some of your content looks suspiciously like MOS content, which, if true, puts you in violation of your support agreement.
                        • 9. Re: Is using Natural Keys bad?
                          EdStevens
                          1004338 wrote:
                          <snip>>
                          P.S. BTW, isn't there anything here in the forums to preserve FORMATTING and TABS?? :(
                          There is, but unlike some other fora, there is not an icon to click on to get it. The trick is what we refer to as 'code tags' They look like this: \
                          Put one on the line just before the text you want to preserve the formatting, and one just after.  As you are creating your message, it will look like this:
                          
                          \
                          sql> select sysdate from dual;
                          \
                          And when you post it it will look like this:
                          sql> select sysdate from dual;
                          You can use the "Preview" tab to see what a posting will look like before you pull the trigger.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                
                          • 10. Re: Is using Natural Keys bad?
                            EdStevens
                            1004338 wrote:
                            Hi there. First time poster, and fairly new to the database world.

                            Up front, I would like to say that I am working with MySQL, but since I consider Oracle people to be some of the smartest, I figured I'd join and try to learn from the pros!!

                            I would like to know if it is "bad" to use Natural/Physical Keys as the Primary Key in a table?

                            How does this affect performance?

                            I have some tables which are basically "look-up" tables, and it would be much easier for me to work with Natural Keys, but I don't want to choose something that is easier for me, that would mess up my database.

                            Sincerely,


                            Debbie
                            As others have stated, there are two schools of thought and a healthy does of 'it depends'. I've come to a more doctrinaire position in favor of synthetic PK, derived from a sequence. You'd be surprised at the natural data elements that appear to be "unchangeable" -- that are changeable. Even a Social Security Number. There are some elements of data that truly are cast in steel (literally -- like an automobile's VIN), but they are few and far between.

                            I think if I had a single-column look-up table -- just providing a somewhat dynamic list of valid values -- I might consider foregoing a separate PK column. But others may point up a fallacy there as well.
                            • 11. Re: Is using Natural Keys bad?
                              rp0428
                              >
                              But I also have a "Natural Key" (i.e. Slug) using a Unique Index which is understandable in English, and which makes my Junction Table much easier to read!!

                              Some people would say that in SECTION_DIMENSION I could drop the "id" and just make "section_slug" + "dimension_slug" my PK. (I always debate over this, since I was taught that all Tables must have an AutoIndex PK...)
                              >
                              That gets back to what I said earlier. If those 'slug' values (ugh! I hate slugs - step on them every chance I get) are subject to change then you are going to have those maintenance problems I mentioned except now, because of the intersect table, you will have a three table version of the problem.

                              For an OLAP or datawarehouse system data is often denormalized to improve query performance. But for data where the key values might change, even lookup data, your UPDATE process for that datawarehouse will get more complicated.

                              Sometimes those lookup tables need to be rebuilt after the main data is loaded. That might not be a problem for a datawarehouse but for OLAP it usually isn't an option.
                              • 12. Re: Is using Natural Keys bad?
                                jay1980
                                One of the other considerations in an OLTP environment is if the application (JEE based) is using an ORM it is easier to work with surrogate keys getting populated through the usage of sequences.

                                Thanks,
                                Jayadeep
                                • 13. Re: Is using Natural Keys bad?
                                  jgarry
                                  >
                                  As others have stated, there are two schools of thought and a healthy does of 'it depends'. I've come to a more doctrinaire position in favor of synthetic PK, derived from a sequence. You'd be surprised at the natural data elements that appear to be "unchangeable" -- that are changeable. Even a Social Security Number. There are some elements of data that truly are cast in steel (literally -- like an automobile's VIN), but they are few and far between.
                                  But of course, google "VIN cloning." People trust the bar code on the dash without looking at the stampings. And this is enlightening: http://www.hemmings.com/mus/stories/2006/07/01/hmn_feature16.html?t=printable Often with motor vehicle departments, all they care about is the column being filled, they just have you sign a statement and buy a bond to cover their butts for anything weird, like the common use of frame pieces to register hot rods.

                                  >
                                  I think if I had a single-column look-up table -- just providing a somewhat dynamic list of valid values -- I might consider foregoing a separate PK column. But others may point up a fallacy there as well.
                                  Probably the determining factor is what else has been done in an existing system.
                                  • 14. Re: Is using Natural Keys bad?
                                    Debbie_1004338a
                                    Thanks for all of the responses, but I am not exactly picking up what people thought of the examples I provided... :-/

                                    I thought I'd get some point for this design, which is sorta the best of both worlds...

                                    SECTION table
                                    - id (PK)
                                    - slug (UK)
                                    - name (UK)
                                    - created_on
                                    - updated_on


                                    Doesn't that follow what most of you are saying?


                                    And what about my junction table...

                                    SECTION_DIMENSION junction
                                    - id (PK)
                                    - section_slug (UK1) *(FK1)*
                                    - dimension_slug (UK2) *(FK2)*
                                    - created_on
                                    - updated_on


                                    Again, I wasn't sure since people were responding in more general terms, but it sort of sounded like at least one person thought that having section_slug and dimension_slug as the Foreign Keys would be a maintenance issue, but I was unclear on what everyone thinks?


                                    (BTW, in MySQL, there is the concept of CASCADE ON UPDATE which would make it easy to change a PK and the corresponding FK's - less issues like backups mentioned earlier.)

                                    Sincerely,


                                    Debbie

                                    Edited by: 1004338 on May 6, 2013 11:11 AM
                                    1 2 3 4 Previous Next