1 2 Previous Next 17 Replies Latest reply: Jul 26, 2013 5:05 AM by Nikolay Savvinov RSS

    Index on column with only 2 distinct values

    DheerajSingh

      Hi All,

       

      Could i create an Index on column with only 2 distinct values('Y' and 'N').

       

      Thanks.

       

      Regards,

      Dheeraj

        • 1. Re: Index on column with only 2 distinct values
          SomeoneElse

          Technically yes.  But it might not be a good idea.

          • 2. Re: Index on column with only 2 distinct values
            Srikkanth.M

            Hi,

             

            Its a waste of creating index in this column,as index is used with the column that have lot of duplicate values and null values.

             

            regards

            Sri    

            • 4. Re: Index on column with only 2 distinct values
              Pacmann

              Hi,

              It is not necessarily a bad idea : if there are mostly one of these two values, and you are interested in that one, the index, with the corresponding histogram can help.

              • 5. Re: Index on column with only 2 distinct values
                _Karthick_

                You can't generalize and give an answer. Each case need to be addressed with its own specifications.

                 

                For example see this example

                 

                Consider this table where you have a column with evenly distributed values of Y and N like this.

                 

                SQL> drop table t;

                Table dropped.

                 

                SQL> create table t
                  2  as
                  3  select t.*
                  4       , case when rownum <= count(*) over()/2 then 'Y' else 'N' end as flag
                  5    from all_objects t;

                Table created.

                 

                SQL> create index t_idx on t(flag);

                Index created.

                 

                SQL> exec dbms_stats.gather_table_stats(user, 'T', cascade=>true)

                PL/SQL procedure successfully completed.

                 

                Now see the execution plan and cost for a index scan vs FTS. Index definitely does good.

                 

                SQL> select count(*)
                  2    from t
                  3   where flag = 'Y';

                  COUNT(*)
                ----------
                    106680

                 

                SQL> select * from table(dbms_xplan.display_cursor);

                 

                PLAN_TABLE_OUTPUT
                -----------------------------------------------------------------------------------
                SQL_ID  43r5crkmtdb1q, child number 0
                -------------------------------------
                select count(*)   from t  where flag = 'Y'

                Plan hash value: 3482591947

                ---------------------------------------------------------------------------
                | Id  | Operation         | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
                ---------------------------------------------------------------------------
                |   0 | SELECT STATEMENT  |       |       |       |    60 (100)|          |
                |   1 |  SORT AGGREGATE   |       |     1 |     2 |            |          |
                |*  2 |   INDEX RANGE SCAN| T_IDX |   106K|   208K|    60   (4)| 00:00:01 |
                ---------------------------------------------------------------------------

                Predicate Information (identified by operation id):
                ---------------------------------------------------

                   2 - access("FLAG"='Y')


                19 rows selected.

                 

                SQL> select /*+FULL(t)*/ count(*)
                  2    from t
                  3   where flag = 'Y';

                 

                  COUNT(*)
                ----------
                    106680

                 

                SQL> select * from table(dbms_xplan.display_cursor);

                 

                PLAN_TABLE_OUTPUT
                -----------------------------------------------------------------------------------
                SQL_ID  gmaf75wmfbqb4, child number 0
                -------------------------------------
                select /*+FULL(t)*/ count(*)   from t  where flag = 'Y'

                Plan hash value: 1842905362

                ---------------------------------------------------------------------------
                | Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
                ---------------------------------------------------------------------------
                |   0 | SELECT STATEMENT   |      |       |       |   835 (100)|          |
                |   1 |  SORT AGGREGATE    |      |     1 |     2 |            |          |
                |*  2 |   TABLE ACCESS FULL| T    |   106K|   208K|   835   (5)| 00:00:11 |
                ---------------------------------------------------------------------------

                Predicate Information (identified by operation id):
                ---------------------------------------------------

                   2 - filter("FLAG"='Y')


                19 rows selected.

                 

                SQL>

                • 6. Re: Index on column with only 2 distinct values
                  myOra_help

                  Hi,

                  If your Table is not updated frequently then you can consider to create BITMAP index. BITMAP index is always advisable for systems in which data is not frequently updated by many concurrent systems and generally appropriate for columns having low cardinality (low distinct values) such as GENDER, MARITAL_STATUS, and RELATION.

                  The size of the bitmap index depends on the cardinality of the column on which it is created as well as the data distribution. Consequently, a bitmap index on the GENDER column will be smaller than a B-tree index on the same column. In contrast, a bitmap index on EMPNO (a candidate for primary key) will be much larger than a B-tree index on this column.

                   

                  http://www.oracle.com/technetwork/articles/sharma-indexes-093638.html

                  http://docs.oracle.com/cd/B28359_01/server.111/b28313/indexes.htm

                  • 7. Re: Index on column with only 2 distinct values
                    padders

                    Particularly useful if you want to know how many 'Y's there were in the table a few seconds ago.

                    • 8. Re: Index on column with only 2 distinct values
                      _Karthick_

                      padders wrote:

                       

                      Particularly useful if you want to know how many 'Y's there were in the table a few seconds ago.

                       

                      What?

                      • 9. Re: Index on column with only 2 distinct values
                        Lalit Kumar B

                        DheerajSingh wrote:

                         

                        Could i create an Index on column with only 2 distinct values('Y' and 'N').

                         

                        If I take your question literally, the answer is YES. You have all the right to create an index. The actual question is why do you want to do that?

                         

                        It's not necessary that if you create an index then it will be used by the CBO. Anyway, you can give it a try and check the clustering factor of the index. There are hundreds of reasons for saying a YES or NO to your question.

                        • 10. Re: Index on column with only 2 distinct values
                          padders

                          > select count(*) from t where flag = 'Y'

                           

                          I was implying that your test query was not representative of something you would actually do and hence does not seem to be a very useful test.

                          • 11. Re: Index on column with only 2 distinct values
                            Etbin

                            You can check what might happen => https://forums.oracle.com/message/11118414#11118414

                             

                            Regards

                             

                            Etbin

                            • 12. Re: Index on column with only 2 distinct values
                              bencol

                              A single bitmap index on a gender column may not be much more useful than a b-tree index. The power of bitmap indexes is in their ability to be combined, and not in the fact that they are stored efficiently for columns with low cardinality. They are useful when you cannot easily predict the filters that will be used to access the data, this is why they can be useful in reporting databases.

                               

                              For there to be a benefit for a single index, then there must be significant skew towards one value. In this case either a histogram or a function based index, DECODEing the common value to null could be appropriate. If there is not significant skew in the values, and you need to access other data in the table, and there are no other filters, then a full table scan will be most efficient

                              • 13. Re: Index on column with only 2 distinct values
                                _Karthick_

                                padders wrote:

                                 

                                > select count(*) from t where flag = 'Y'

                                 

                                I was implying that your test query was not representative of something you would actually do and hence does not seem to be a very useful test.

                                 

                                You don't do COUNT? We do it all the time around here.

                                • 14. Re: Index on column with only 2 distinct values
                                  padders

                                  What do you use it for? Perhaps you should consider whether you really need to.


                                  "Counting records is not smart, inefficient and should always be avoided", Tom Kyte, November 2009

                                  1 2 Previous Next