Forum Stats

  • 3,854,212 Users
  • 2,264,340 Discussions
  • 7,905,609 Comments

Discussions

Composite index vs Single column index

Hesipesi
Hesipesi Member Posts: 188 Blue Ribbon
edited Dec 16, 2015 10:51AM in General Database Discussions

Hi,

I have a table with a,b,c,d,e,f,g,h,i,j,k. columns and i have an index on a,b columns!

There is a sql statement now with where a= ?  and we are wondering if it could also be good to add a single index on just 'a' column!

doesn't help AT ALL? it does help in SOME CASES? or??? 

what do you think??

Thanks!

/Hesi

Michal ŠimoníkjgarryHesipesi

Best Answer

  • Jonathan Lewis
    Jonathan Lewis Member Posts: 10,061 Blue Diamond
    edited Dec 14, 2015 12:37PM Answer ✓
    HeSi9466 wrote:
    
    I think, In general when you have a composite index like (a,b) and you have a query like where a=? then that composite index is enough and you don't need to also add a single index!
    But...I think this answer CAN NOT be a final answer for all the possible combination of data.
    For instance if the leading index column (a) has a low number of distinct values and the second index column(b) has a huge number of different values then a query which looks for where a=? could be better off with a single index an (a) column.
    My point is, this needs to be tested and one can't say NO, NO SINGLE INDEX IS NEEDED at all.
    
    
    

    The basic answer is that you don NOT need the single column index if you have the two-column index.

    The complex answer is that you may have to spend some time and effort ensuring that the two-column index is used in all cases when it would have been appropriate to use the single column index. This may simple mean ensuring the clustering_factor of the index is adjusted suitably so that the optimizer "likes" the index enough and/or you have to modify some code to use the cluster_by_rowid hint (when you're at 12c) so that you don't suffer a performance impact at run-time.

    Key factors: the two-column index will be physically larger than the single column index - this will increase the (optimizer's estimated) cost of using it; the clustering_factor of the two_column index will almost certainly be larger than the clustering_factor of the single column index - this will also increase the (optimizer's estimated) cost of using it. These two points are echoed at run-time: the two column index will be bigger so you will have to do more work to read the relevant rowids, and if you walk the two-column index in order for a given value of the first column you will visit the table blocks in a different order compared to the order of visits from the single column index - this may result in the query actually doing more work at run-time.

    The change in the index leaf_block count is often insignificant (especially if, as per your example, the number of rows - hence blocks in the table - is large); the impact of the clustering_factor can make a dramatic difference to the cost calculations; but you can often work around this. In 11.2.0.4, particularly, you can use the dbms_stats.set_table_prefs() call to set the 'table_cached_blocks' parameter for a table so that all its indexes look more desirable to the optimizer.

    Bottom line - you don't need the single column index, but if you currently have it and want to drop it the human effort required to ensure that it can be dropped without side effects may make you decide to keep it anyway, especially if it doesn't seem to be causing any concurrency or other performance overheads.  If you don't have it yet, then you shouldn't need to create it.

    Regards

    Jonathan Lewis

    Michal ŠimoníkjgarryHesipesi

Answers

  • AndrewSayer
    AndrewSayer Member Posts: 13,007 Gold Crown
    edited Dec 11, 2015 7:41AM

    It depends!

    If you are just doing

    select a,b from table where a = :a

    Then yes an index on (a,b) would mean you wouldn't need to visit the table at all. BUT the index will be twice as large, whether or not that makes a difference to the b-level (and therefore IO) is up to the size of the dataset.

    You can still use an index on (a,b) if you are selecting all data for a =:a but you may have additional IO in the index (as above), you still have to access the table (no change there), but perhaps most importantly the stats will be different (cluster facter, leaf blocks..) so Oracle will calculate a different cost in using the index so your plans could be different.

    Best way of finding out is to test the theory with your REAL data set (stats and all)

  • Hesipesi
    Hesipesi Member Posts: 188 Blue Ribbon
    edited Dec 11, 2015 8:21AM

    So you mean I can add the single index in the table and test to see if it makes an improvement on the performance?

    The reason we have that composite index with (a,b) is that, there are other SQL statements which are using that index. So the question is, could it be helpful if I add another single index for some queries which are using the (a) column???  OR we will be just fine with that (a,b) index even with statements like where a= ?    ??????

  • Pavan Kumar
    Pavan Kumar Member Posts: 11,904 Gold Crown
    edited Dec 11, 2015 10:34AM
    HeSi9466 wrote:
    
    So you mean I can add the single index in the table and test to see if it makes an improvement on the performance?
    
    The reason we have that composite index with (a,b) is that, there are other SQL statements which are using that index. So the question is, could it be helpful if I add another single index for some queries which are using the (a) column???  OR we will be just fine with that (a,b) index even with statements like where a= ?    ??????
    

    1. Reason for adding index is to help/address the issue of efficient of block fetches. Now, you don't have any issue with existing index in performance wise, I don't suggest for one more index. It would be incurring additional maintenance during dml operations on table segment.

    - Pavan Kumar N

  • jgarry
    jgarry Member Posts: 13,844 Gold Crown
    edited Dec 11, 2015 12:03PM

    I'm sure both Richard Foote and Jonathan Lewis have addressed that question, but don't have time to find the posts.  But it has been demonstrated that an extra index may give the optimizer more information, even if it isn't used.  And certainly having too many indices can slow things down.

    Try here: https://richardfoote.wordpress.com/?s=concatenated+index&searchbutton=go!

  • Unknown
    edited Dec 11, 2015 12:17PM
    I have a table with a,b,c,d,e,f,g,h,i,j,k. columns and i have an index on a,b columns!
    
    There is a sql statement now with where a= ?  and we are wondering if it could also be good to add a single index on just 'a' column!
    
    doesn't help AT ALL? it does help in SOME CASES? or???  
    
    what do you think??
    

    Why don't you tell us what YOU think?

    What possible combinations have you thought about and tested? Here is just one

    1. the table has columns a,b,c,d,e,f,g,h,i,j,k

    2. ALL rows have a value of '1' for column a

    Will a 'single index' on just 'a' help AT ALL if users are looking for 'a = 1'?

    Post your answer.

    1. the table has columns a,b,c,d,e,f,g,h,i,j,k

    2. only ONE row has a value of '1' for column a

    Will a 'single index' on just 'a' help AT ALL if users are looking for 'a = 1'?

    Post your answer.

    Don't be afraid of breaking Oracle by actually TRYING things.

    That is often the best, and fastest, way to learn.

  • Hesipesi
    Hesipesi Member Posts: 188 Blue Ribbon
    edited Dec 14, 2015 10:25AM

    Hi,

    I think, In general when you have a composite index like (a,b) and you have a query like where a=? then that composite index is enough and you don't need to also add a single index!

    But...I think this answer CAN NOT be a final answer for all the possible combination of data.

    For instance if the leading index column (a) has a low number of distinct values and the second index column(b) has a huge number of different values then a query which looks for where a=? could be better off with a single index an (a) column.

    My point is, this needs to be tested and one can't say NO, NO SINGLE INDEX IS NEEDED at all.

    What do you think?   please correct me if I'm wrong!

    The answer to your questions

    1. no it won't help since a full table scan needs to be done as well

    2. Yeah, it will help.

  • BPeaslandDBA
    BPeaslandDBA Member Posts: 4,615 Blue Diamond
    edited Dec 14, 2015 10:54AM

    The only way to know for sure is to test it out with your specific table, the queries that will be executed against it, and with your data. Create the index and see if it helps performance.

    Keep in mind that while an index may help a SELECT, it can hamper DML operations because the index may need to be modified. Having a lot of indexes to help out a lot of different SELECT statements can improve performance in one area but hurt performance in another.

    Cheers,
    Brian

    HesipesiHesipesi
  • BPeaslandDBA
    BPeaslandDBA Member Posts: 4,615 Blue Diamond
    edited Dec 14, 2015 11:43AM

    Timely video you might be interested in. It was published today and seems to be something that you are looking for:

    https://www.youtube.com/watch?v=8II2h64NMBs

    Cheers,
    Brian

    HesipesiHesipesi
  • Jonathan Lewis
    Jonathan Lewis Member Posts: 10,061 Blue Diamond
    edited Dec 14, 2015 12:37PM Answer ✓
    HeSi9466 wrote:
    
    I think, In general when you have a composite index like (a,b) and you have a query like where a=? then that composite index is enough and you don't need to also add a single index!
    But...I think this answer CAN NOT be a final answer for all the possible combination of data.
    For instance if the leading index column (a) has a low number of distinct values and the second index column(b) has a huge number of different values then a query which looks for where a=? could be better off with a single index an (a) column.
    My point is, this needs to be tested and one can't say NO, NO SINGLE INDEX IS NEEDED at all.
    
    
    

    The basic answer is that you don NOT need the single column index if you have the two-column index.

    The complex answer is that you may have to spend some time and effort ensuring that the two-column index is used in all cases when it would have been appropriate to use the single column index. This may simple mean ensuring the clustering_factor of the index is adjusted suitably so that the optimizer "likes" the index enough and/or you have to modify some code to use the cluster_by_rowid hint (when you're at 12c) so that you don't suffer a performance impact at run-time.

    Key factors: the two-column index will be physically larger than the single column index - this will increase the (optimizer's estimated) cost of using it; the clustering_factor of the two_column index will almost certainly be larger than the clustering_factor of the single column index - this will also increase the (optimizer's estimated) cost of using it. These two points are echoed at run-time: the two column index will be bigger so you will have to do more work to read the relevant rowids, and if you walk the two-column index in order for a given value of the first column you will visit the table blocks in a different order compared to the order of visits from the single column index - this may result in the query actually doing more work at run-time.

    The change in the index leaf_block count is often insignificant (especially if, as per your example, the number of rows - hence blocks in the table - is large); the impact of the clustering_factor can make a dramatic difference to the cost calculations; but you can often work around this. In 11.2.0.4, particularly, you can use the dbms_stats.set_table_prefs() call to set the 'table_cached_blocks' parameter for a table so that all its indexes look more desirable to the optimizer.

    Bottom line - you don't need the single column index, but if you currently have it and want to drop it the human effort required to ensure that it can be dropped without side effects may make you decide to keep it anyway, especially if it doesn't seem to be causing any concurrency or other performance overheads.  If you don't have it yet, then you shouldn't need to create it.

    Regards

    Jonathan Lewis

    Michal ŠimoníkjgarryHesipesi
  • Hesipesi
    Hesipesi Member Posts: 188 Blue Ribbon
    edited Dec 16, 2015 10:51AM

    Many thanks to all of you who helped me to find the answer to my question!

    I think I've seen you Jonathan this year at OOW in SF! It's a pleasure to here from you again!

    Regards

    /Hesi

This discussion has been closed.