1 2 3 Previous Next 116 Replies Latest reply: Mar 3, 2006 5:46 AM by cd_2 RSS

    SIMPLE Database Design Problem !

    488423
      Mapping is a big problem for many complex applications.

      So what happens if we put all the tables into one table called ENTITY?

      I have more than 300 attributeTypes.And there will be lots of null values in the records of that single table as every entityType uses the same table.

      Other than wasting space if I put a clustered index on my entityType coloumn in that table.What kind of performance penalties to I get?

      Definition of the table

      ENTITY
      ---------------
      EntityID > uniqueidentifier
      EntityType > Tells the entityTypeName
      ...
      Name >
      LastName >
      CompanyName > 300 attributeTypes
      OppurtunityPeriod >


      PS:There is also another table called RELATION that points the relations between entities.
        • 1. Re: SIMPLE Database Design Problem !
          cd_2
          Have a look at this thread:

          http://asktom.oracle.com/pls/ask/f?p=4950:8:::::F4950_P8_DISPLAYID:10678084117056

          C.
          • 2. Re: SIMPLE Database Design Problem !
            488423
            hey cd

            the thing is as tom says
            (2)Do not use Generic Data Models
            this domain is not GENERIC...my attributeTypes are hand coded.

            in that post it suggests about a design where there are no NULL values.

            But this design does not sperate the entity from the attribute.
            • 3. Re: SIMPLE Database Design Problem !
              cd_2
              I still don't see where your modell differs from the generic approach ... after all you are encoding (part of) your table definition inside the table itself - and for that part, I'd say that the mentioned problem areas in the thread are still valid.

              C.
              • 4. Re: SIMPLE Database Design Problem !
                488423
                Nope.The thing is TOM stated like this.

                2)Do not use Generic Data Models

                Frequently I see applications built on a generic data model for "maximum
                flexibility" or applications built in ways that prohibit performance. Many times
                - these are one in the same thing! For example, it is well known you can
                represent any object in a database using just four tables:

                Create table objects ( oid int primary key, name varchar2(255) );

                Create table attributes
                ( attrId int primary key, attrName varchar2(255),
                datatype varchar2(25) );

                Create table object_Attributes
                ( oid int, attrId int, value varchar2(4000),
                primary key(oid,attrId) );

                >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

                As we see the attributes and entities are seperated.in my case I do not seperate them in the entity Table all off the attributeTypes are coded in it.

                ENTITY
                EntityID
                EntityType
                FirstName
                LastName
                Gender
                OrganizationName
                Salary
                OpputunityDuration
                FacilityName.............................all the attributeTypes are built into the schema of the entity table so there are no joins to the attribute table as there are no attribute tables.

                Tom stated that generic models perform badly.But this is not a generic model.it is a Concrete model where we have all the attributeTypes in the entity Table.

                So that post u have mentioned is about generic data models.I am asking about a dermalized table.Tom is talking about 4 tables I am talking about 2.He is talking about normilized degin , ı am talking about denormalized design.
                • 5. Re: SIMPLE Database Design Problem !
                  cd_2
                  Sorry, my mistake then.

                  I thought that

                  ENTITY
                  ---------------
                  EntityID > uniqueidentifier
                  EntityType > Tells the entityTypeName

                  was your table definition and

                  Name >
                  LastName >
                  CompanyName > 300 attributeTypes
                  OppurtunityPeriod >

                  an example of your data.

                  Denormalized tables are a different topic, and if you take a look at Tom's page you could find some threads on that matter too.

                  My personal view on denormalization: Can make sense, I don't like 'em myself. Regarding performance, without testing nobody can tell if your denormalized modell performs better than a normalized one, since only you know about the requirements and used data.

                  C.
                  • 6. Re: SIMPLE Database Design Problem !
                    488423
                    I mean there is also a close question to this in the post u have mentioned but nobody answered.
                    in the entityType coloumns we show what kind opf table it is.From imagination is we make it clustered all the table information about the same entityType records will be held together.But there is will be lots of null values.

                    I am thinking to access this table using VIEWS.

                    The performance seems OK but I suspect I might be missing something.it is like the performance is the same even I put all the tables information into the same table.(but not using the same coloumns...)

                    if the thing I did is right . why not many people use this technique?
                    • 7. Re: SIMPLE Database Design Problem !
                      APC
                      What you are talking about is the implementation of sub types (these are business types not to be confused with object implementgaion UDTs). So for instance: STUDENT and TEACHER are both sub types of PERSON. They share some attributes but others are distinct.

                      The two possible implementations are
                      (1) one table, PERSON, with a column PERSON_TYPE to distinguish between the various sub types plus columns to represent all the attributes of each entity.
                      (2) three tables, PERSON to hold the common attributes and STUDENT and TEACHER to hold the sub type specific attributes.

                      The scond (multi-table) option is the nomal approach. The general advice is to take this apporach unless you have a very good reason not to. Personally, I have only implemented a single table once, and whilst it worked (kind of) I think I should have gone for the multiple tables.

                      The alleged advantage of the single table is that it seems to simply management. In practice it does the reverse. The more sub types you have the more difficult it becomes to keep the table stable. Tuning becomes very difficult (the Cost-Based Optimizer doesn't stand a chance) and it's almost impossible to use the table to enforce relational integrity (for instance, you can't build a foreign key to ensure that every course has a TEACHER). On the other hand it's virtually impossible to enforce the arc in the multiple table tactic; we have to really on the application to ensure that every PERSON has a matching record in either STUDENT or TEACHER but not in both.

                      Either approach benefits from using views to represent coherent entities.

                      Cheers, APC
                      • 8. Re: SIMPLE Database Design Problem !
                        cd_2
                        The "problem" is, if you keep all the data in one table, regardless of it's type, you will always have to check for the column entity_type. Can have an impact if you have conditions selecting several columns and want to use indices. Having the option of putting different entities in different tables is there for a reason.

                        It get's even more interesting if you have relationships between your entities - you mentioned the table RELATION in that context. Just make a test with an entity table filled with the expected amount of data and some selects between different entities - and than compare that to a modell where you've split your entities in different tables.
                        if the thing I did is right . why not many people use this technique?
                        Have you ever thought of using constraints in your modell? How would you realize those? And that's only one reason, why your thing may not be "right".

                        C.
                        • 9. Re: SIMPLE Database Design Problem !
                          488423
                          The "problem" is, if you keep all the data in one table, regardless of it's type, you will always have to check for the column entity_type. Can have an impact if you have conditions selecting several columns and want to use indices. Having the option of putting different entities in different tables is there for a reason.

                          >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
                          check the coloumn with WHERE _entityType='PERSON'

                          as there is is clustered index on entityType...there is NO performance decrease.

                          there is also a clustered index on RELATION table on relationType

                          when we say WHERE _entityType ='PERSON' or
                          WHERE relationType='CONTACTMECHANISM'.
                          it scans the clustered index first.it acts like a table as it is physically ordered.
                          >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
                          Have you ever thought of using constraints in your modell? How would you realize those?

                          ...in fact we did.We have arranged the generic object model in an object database.The knowledge information is held in the object database.
                          But the data schema is held in the RDBMS with code generation that creates a schema to hold data.

                          But in able to have a efficent mapping and a good performance we have thought about building only one table.The problem is we know we are losing some space but the thing is harddisk is much cheaper than RAM and CPU.So our trade off concerated on the storage cost.But I still wonder if there is a point that I have missed in terms performance?

                          PS: it is not wise effective using generic object models also in object databases as CPU cost is a lot when u are holding the data.

                          PS2: RDBMS is a value based system where object databases are identity based.we are trying to be in the gray area of both worlds.
                          • 10. Re: SIMPLE Database Design Problem !
                            6363
                            Performance will suffer significantly.
                            as there is is clustered index on entityType...there is NO performance decrease.
                            Cool trick, unfortunately Oracle does not have clustered indexes.
                            We have arranged the generic object model in an object database.
                            Object databases where a bad idea 40 years ago when they were replaced by relational databases and they are still a bad idea now.
                            RDBMS is a value based system where object databases are identity based.we are
                            trying to be in the gray area of both worlds.
                            A big ball of mud is also a grey area

                            (Thanks to APC for that link)
                            • 11. Re: SIMPLE Database Design Problem !
                              cd_2
                              >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
                              check the coloumn with WHERE _entityType='PERSON'

                              as there is is clustered index on entityType...there
                              is NO performance decrease.

                              there is also a clustered index on RELATION table on
                              relationType

                              when we say WHERE _entityType ='PERSON' or
                              WHERE relationType='CONTACTMECHANISM'.
                              it scans the clustered index first.it acts like a
                              table as it is physically ordered.
                              I was thinking in terms of using several conditions in the same select, such as
                              WHERE _entityType ='PERSON'
                                AND LastName LIKE 'A%' 
                              In your case you have to use at least two indices, and since your clustered index comes first ...

                              >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
                              Have you ever thought of using constraints in your
                              modell? How would you realize those?

                              ...in fact we did.We have arranged the generic object
                              model in an object database.The knowledge information
                              is held in the object database.
                              So your relational database is used only as a "simple" storage, everything has go through your object database.
                              But the data schema is held in the RDBMS with code
                              generation that creates a schema to hold data.
                              If you think that this approach makes sense, why not.
                              But in able to have a efficent mapping and a good
                              performance we have thought about building only one
                              table.The problem is we know we are losing some space
                              but the thing is harddisk is much cheaper than RAM
                              and CPU.So our trade off concerated on the storage
                              cost.But I still wonder if there is a point that I
                              have missed in terms performance?
                              Just test your approach by using sufficiently data - only you know how many records you have to store in your modell.
                              PS: it is not wise effective using generic object
                              models also in object databases as CPU cost is a lot
                              when u are holding the data.
                              I don't know if I'd have taken your approach - using two database systems to hold data and business logic.
                              PS2: RDBMS is a value based system where object
                              databases are identity based.we are trying to be in
                              the gray area of both worlds.
                              Like I wrote: if your approach works and scales to the required size, why not? I would assume that you did a load test with your approach.

                              What I would question though is that your discussing a "SIMPLE Database Design" problem. I don't see anything simple in your approach when it comes to implementation.

                              C.
                              • 12. Re: SIMPLE Database Design Problem !
                                488423
                                A big ball of mud is also a grey area
                                >>>>>>>>>>>>>>>>>>>>>>>

                                yes in fact we are using an "Adaptive Object Model" for our object model which is held in an object database.Where Yoder,Biran Foote and Ralph Johnson is the contributers to the idea.

                                Do u have any experience in this?

                                What do u think is we have solved the performance problem and reduced the code size to 1 to 200 ?
                                • 13. Re: SIMPLE Database Design Problem !
                                  6363
                                  I am not sure I understand what you are saying. You asked if there will be performance problems (there will), and now you say you have solved them. That seems pretty quick.

                                  I am still interested in how you intend to create a clustered index though.
                                  • 14. Re: SIMPLE Database Design Problem !
                                    cd_2
                                    What do u think is we have solved the performance
                                    problem and reduced the code size to 1 to 200 ?
                                    Congratulations then, you found your "perfect" solution. Just one question: if everything works, why do you have still doubts in your concept?

                                    C.
                                    1 2 3 Previous Next