This discussion is archived
1 2 Previous Next 16 Replies Latest reply: Apr 15, 2013 10:46 PM by Jan-Marten Spit RSS

Order of columsn in table - how important from performance perspective

user5716448 Explorer
Currently Being Moderated
Hi ,

Using oracle 11.2.0.3 and it has been suggested to us that more efficient to have most freqeuntly used columns at start of table as use less cpu cycles.

Currently the keys of the table are listed first but as existing live tables wonder about implicatuions if we tried to rearraneg ordr of columsn in table

1) How big is the performance gain really?

2) Can you easily rearrange order of columns in table or do you have to drop and recreate the table?

Thanks
  • 1. Re: Order of columsn in table - how important from performance perspective
    Justin Cave Oracle ACE
    Currently Being Moderated
    It's extremely unlikely that the order of columns in a table will matter from a performance perspective. If you have a table with more than 255 columns or if have a large number of columns that are often NULL, the order of columns may have some impact on performance. But for the vast majority of situations, it's not relevant.

    There is no easy way to change the order of columns in a table. You'd need to drop and recreate it (or use DBMS_REEFINITION which, behind the scenes, is doing just that).

    Justin
  • 2. Re: Order of columsn in table - how important from performance perspective
    EdStevens Guru
    Currently Being Moderated
    user5716448 wrote:
    Hi ,

    Using oracle 11.2.0.3 and it has been suggested to us that more efficient to have most freqeuntly used columns at start of table as use less cpu cycles.

    Currently the keys of the table are listed first but as existing live tables wonder about implicatuions if we tried to rearraneg ordr of columsn in table

    1) How big is the performance gain really?
    I can't imagine any performance gain at all, unless you have chained rows. And unless you have individual rows whose length exceeds the size of a block, you shouldn't have chained rows. (yes, you can have chained rows but that is addressed in a different manner). So, if the smallest unit of I/O is a block, and an entire row is on that block, how are cpu cycles saved by having "more frequently used columns" at a lower offset in the block (and remember, that is a lower offset only in relation to the other columns of that particular row) ?
    >
    2) Can you easily rearrange order of columns in table or do you have to drop and recreate the table?

    Thanks
  • 3. Re: Order of columsn in table - how important from performance perspective
    Hoek Guru
    Currently Being Moderated
    You should consider the order of columns in an index and not worry about the order of columns in a table.
    http://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:5671539468597
  • 4. Re: Order of columsn in table - how important from performance perspective
    user5716448 Explorer
    Currently Being Moderated
    Hi,


    Thanks for replies.

    It is a data warehouse and have a large number of columns that are often NULL since it is a dimension with 3 levels in the hierarchy - these are at start of table after the key columns.

    Order of columns

    Table has pk at start (dimension_key)
    all columns for lowest level of dimension (30 columns) these can be null
    then 20 columns which relate to next level in the dimension - again can be null
    then 10 columns for highest level of dimension - these always populated.

    Oracle Warehouse builder tool creates the tables based upon the dimension definition so has ordered the columsn in this way.

    Perf team think that because columns at the end of table often accessed that more cpu is being consumed by having these at end rather than start.

    Saying twice the cpu used accessing these columsn at end of table compared to accessing columns at start of table.

    Any way to avoid this without having to reorg tables and fact twice cpu used what impact would this have on overall query speeds e.g. would there be a danger in having the more populated columns at start of table.

    Oracel warehouse builder has determined the oredr and I'm assuming it puts the key columns first and the null columns first in table order for a reason to make it more efficeint?

    Large dimension table 8 million rows

    Large fact table expect have 3 billion rows when finished - 1.5 billion at moment.


    Thanks

    Edited by: user5716448 on 13-Apr-2013 05:29
  • 5. Re: Order of columsn in table - how important from performance perspective
    rp0428 Guru
    Currently Being Moderated
    >
    Oracle Warehouse builder tool creates the tables based upon the dimension definition so has ordered the columsn in this way.

    Perf team think that because columns at the end of table often accessed that more cpu is being consumed by having these at end rather than start.
    >
    Post the reasons they gave you for thinking that.
    >
    Saying twice the cpu used accessing these columsn at end of table compared to accessing columns at start of table.
    >
    Post the proof that supports that statement. Do they have properly defined and executed tests that show you are using twice the cpu with those columns at the end of the table instead of at the beginning?

    If so, then you should modify your tables. If you think their tests and flawed, improperly executed or being interpreted wrong then post them so we can see how they reached their conclusions.
    >
    Any way to avoid this without having to reorg tables and fact twice cpu used what impact would this have on overall query speeds
    >
    So far, there is NOTHING to avoid. You haven't posted any reason to think you are having a problem other than a general, and unsubstantiated statment about what your performance team thinks.

    It it ain't broke, don't fix it.
    >
    e.g. would there be a danger in having the more populated columns at start of table.
    >
    No.
  • 6. Re: Order of columsn in table - how important from performance perspective
    EdStevens Guru
    Currently Being Moderated
    user5716448 wrote:
    Hi,


    Thanks for replies.

    It is a data warehouse and have a large number of columns that are often NULL since it is a dimension with 3 levels in the hierarchy - these are at start of table after the key columns.

    Order of columns

    Table has pk at start (dimension_key)
    all columns for lowest level of dimension (30 columns) these can be null
    then 20 columns which relate to next level in the dimension - again can be null
    then 10 columns for highest level of dimension - these always populated.

    Oracle Warehouse builder tool creates the tables based upon the dimension definition so has ordered the columsn in this way.

    Perf team think that because columns at the end of table often accessed that more cpu is being consumed by having these at end rather than start.
    Ask the "perf team" to quit thinking and produce a reproducible demonstration. At the very least, ask them to produce a technical explanation of why they "believe" this.

    Think about the process of retrieving a row, and a column within a row.

    First, the block containing the desired row is read. This will happen either after the block is identified by processing an index, yielding a block id, or the block is read as part of a full table scan. Either way the entire block gets read into a buffer. All of the activity to get the block into the buffer is going to occur regardless of what columns are requested. Once the block is in the buffer, the offset to the row has to be calculated. Is that calculation going to be more or less expensive based on the location of any particular columns within the row? Next, the offset to the requested columns is calculated. Is it going to more expensive to calculate an offset to a location closer to the start of the row than to a location further into the row?




    >
    Saying twice the cpu used accessing these columsn at end of table compared to accessing columns at start of table.
    And why does it take more cpu to caclulate the offset to column 512 than it does to calculate the offset to column 1?

    >
    Any way to avoid this without having to reorg tables and fact twice cpu used what impact would this have on overall query speeds e.g. would there be a danger in having the more populated columns at start of table.
    There's nothing to avoid except people who make performance claims based on .... fantasy.
    >
    Oracel warehouse builder has determined the oredr and I'm assuming it puts the key columns first and the null columns first in table order for a reason to make it more efficeint?

    Large dimension table 8 million rows

    Large fact table expect have 3 billion rows when finished - 1.5 billion at moment.


    Thanks

    Edited by: user5716448 on 13-Apr-2013 05:29
  • 7. Re: Order of columsn in table - how important from performance perspective
    rp0428 Guru
    Currently Being Moderated
    >
    And why does it take more cpu to caclulate the offset to column 512 than it does to calculate the offset to column 1?
    >
    Because Oracle has to start at the beginning of the row and then find the column of interest by 'skipping over' columns that are not needed. The more columns that need to be ignored the greater the number of 'skips' that need to occur and the more iterations it will take to get to column 512.

    Each column's data is preceded by a length byte so to get to column 2 you have to begin at the start of the row and add the length of column 1. Then to get to column three you need to add the length of column 2, and so on. There is an exception for consecutive null columns where Oracle encodes how many there are and can them skip all of them at once.

    This writeup by Jonathan Lewis mentions it tangential to the discussion of whether 'column order' matters in tables that tend to have several consecutive columns that are null.
    http://jonathanlewis.wordpress.com/2012/10/01/row-sizes-2/
  • 8. Re: Order of columsn in table - how important from performance perspective
    Girish Sharma Guru
    Currently Being Moderated
    For more details please read [url http://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:2818414600346806270]Table columns position and performance impact, [url http://hoopercharles.wordpress.com/2010/05/22/column-order-in-a-table-does-it-matter-1/]Column Order in a Table – Does it Matter? 1. I found below below undocumented and not tested rules from below link :

    Group related columns together.
    Not NULL columns before null-able columns.
    Frequently searched un-indexed columns first.
    Rarely filled null-able columns last.
    Static columns first.
    Updateable varchar columns later.
    Indexed columns after other searchable columns.

    Source:http://stackoverflow.com/questions/4939735/re-order-columns-of-table-in-oracle

    Regards
    Girish Sharma
  • 9. Re: Order of columsn in table - how important from performance perspective
    Dom Brooks Guru
    Currently Being Moderated
    1) How big is the performance gain really?
    It is going to absolutely depend on your specific circumstances so test it.
    Should be easy to test a number of scenarios based on your specific circumstances.

    2) Can you easily rearrange order of columns in table or do you have to drop and recreate the table?
    Drop and recreate.
    dbms_redefinition (which is not a lot different).


    If it does make a significant difference then you are an edge case.
  • 10. Re: Order of columsn in table - how important from performance perspective
    Jonathan Lewis Oracle ACE Director
    Currently Being Moderated
    rp0428 wrote:

    Each column's data is preceded by a length byte so to get to column 2 you have to begin at the start of the row and add the length of column 1. Then to get to column three you need to add the length of column 2, and so on. There is an exception for consecutive null columns where Oracle encodes how many there are and can them skip all of them at once.
    That's the type of compression trick that Oracle uses in a few places, but not for null columns. The only special case for null columns is that Oracle doesn't record them when they're at the end of the row; otherwise a null column is represented by a single byte (0xff).

    In the case of the OP, with a fairly small number of columns the CPU benefit of moving columns around is likely to be small if you consider only the CPU spent once the block has been buffered and pinned; however he has a large number of rows with 30 null columns - if he moved those to the end of the row he would be saving 30 bytes per row in storage which might be a significant fraction of the row length, which might lead to more rows per block, which would lead to fewer blocks for the object, which might lead to better buffering of the object, and the combination of better buffering and more rows per block might lead to less CPU being used locating and loading blocks into the cache.

    The downside of being too fanatic about every CPU cycle is that your cunning tricks might be forgotten at some future time. For example, if someone decides to add a (non-null) column to his table every row would suddenly grow by an "unexpected" 30 bytes because the addition would require a byte to be included for the 30 null columns. Once you get clever with column positions you may have to rebuild the table every time you add a column.


    Regards
    Jonathan Lewis
  • 11. Re: Order of columsn in table - how important from performance perspective
    Jan-Marten Spit Explorer
    Currently Being Moderated
    "it's extremely unlikely that the order of columns in a table will matter from a performance perspective. If you have a table with more than 255 columns or if have a large number of columns that are often NULL, the order of columns may have some impact on performance. But for the vast majority of situations, it's not relevant."

    you are right about the >255 columns case, but i disagree on the other claim.

    i tested this extensively, and there IS relevance in terms of CPU usage. in fact, i observed the CBO CPU costing is actually taking the column position into account.

    you should put you frequently accessed columns near the front (low column_id), as oracle needs to calculate it's way to the column start when retrieving (or filtering away) data, and adding a few integers is less costly than adding dozens of integers. as stated above, for high cardinality filters this behavior may even cause the CBO to generate a 'unexpected' plan.

    you should also (as you point out) keep columns with most NULL values near the end, and in rare cases these criteria collide.
  • 12. Re: Order of columsn in table - how important from performance perspective
    Jan-Marten Spit Explorer
    Currently Being Moderated
    "At the very least, ask them to produce a technical explanation of why they "believe" this."

    well :)

    "Is it going to more expensive to calculate an offset to a location closer to the start of the row than to a location further into the row?"

    YES.

    "And why does it take more cpu to caclulate the offset to column 512 than it does to calculate the offset to column 1?"

    because Oracle does not maintain a per-column ofsset relative to the start of the row.

    oracle calculates the start of column X by adding the widths of columns 0 to X-1.

    as i stated elsewhere in this thread, the CBO CPU cost actually includes this penalty.

    eg, the CPU cost of filtering 1E6 rows on column 200 will be higher than filtering 1E6 rows on column 1, and the same holds for the actual CPU usage.

    now i would agree that in most cases the effect is practically negligable. but i have also seen cases where it (reordering the columns in a table) reduced CPU usage significantly. in order to do that, you must be so lucky that the developers created column-position independent SQL.
  • 13. Re: Order of columsn in table - how important from performance perspective
    EdStevens Guru
    Currently Being Moderated
    Jan-Marten Spit wrote:
    "At the very least, ask them to produce a technical explanation of why they "believe" this."

    well :)

    "Is it going to more expensive to calculate an offset to a location closer to the start of the row than to a location further into the row?"

    YES.

    "And why does it take more cpu to caclulate the offset to column 512 than it does to calculate the offset to column 1?"

    because Oracle does not maintain a per-column ofsset relative to the start of the row.

    oracle calculates the start of column X by adding the widths of columns 0 to X-1.

    as i stated elsewhere in this thread, the CBO CPU cost actually includes this penalty.
    Got it. Seems that I used to know that ...
    eg, the CPU cost of filtering 1E6 rows on column 200 will be higher than filtering 1E6 rows on column 1, and the same holds for the actual CPU usage.

    now i would agree that in most cases the effect is practically negligable. but i have also seen cases where it (reordering the columns in a table) reduced CPU usage significantly. in order to do that, you must be so lucky that the developers created column-position independent
    SQL.

    Yes, I would still expect it to be a HIGHLY exceptional case that it made a meaningful difference.
  • 14. Re: Order of columsn in table - how important from performance perspective
    Jan-Marten Spit Explorer
    Currently Being Moderated
    exceptional, yes ...but a bit more common in datawarehouses. it starts to contribute with full table scans and lots of rows and filtering.

    also, Siebel. Siebel loves tables with a lot of columns, and they are often customized with new columns - at the end. even if there are only a few dozen rows filter per query, there is still significant overhead if you have a lot of executions/s, not to mention the sapce waste if the trailing null columns are suddenly ammened by a not-null column.

    i mentioned the CBO taking the column position into account. i tested this once with a 600 column table, and found that indeed the CBO does see this cost, but not the overhead of the 255-columns-per-row-piece effect. in other words, the CBO thinks that accessing column 255 (the first in the next row piece) is only marginally worse than acessing column 254, even if it is extra-block chained (whcih the CBO canot possibly know with the current stats).
1 2 Previous Next

Legend

  • Correct Answers - 10 points
  • Helpful Answers - 5 points