This discussion is archived
1 2 Previous Next 17 Replies Latest reply: Jul 26, 2013 3:05 AM by Nikolay Savvinov RSS

Index on column with only 2 distinct values

DheerajSingh Newbie
Currently Being Moderated

Hi All,

 

Could i create an Index on column with only 2 distinct values('Y' and 'N').

 

Thanks.

 

Regards,

Dheeraj

  • 1. Re: Index on column with only 2 distinct values
    SomeoneElse Guru
    Currently Being Moderated

    Technically yes.  But it might not be a good idea.

  • 2. Re: Index on column with only 2 distinct values
    Srikkanth.M Pro
    Currently Being Moderated

    Hi,

     

    Its a waste of creating index in this column,as index is used with the column that have lot of duplicate values and null values.

     

    regards

    Sri    

  • 3. Re: Index on column with only 2 distinct values
    Rahul_India Journeyer
    Currently Being Moderated

    Someone why?

  • 4. Re: Index on column with only 2 distinct values
    Pacmann Journeyer
    Currently Being Moderated

    Hi,

    It is not necessarily a bad idea : if there are mostly one of these two values, and you are interested in that one, the index, with the corresponding histogram can help.

  • 5. Re: Index on column with only 2 distinct values
    Karthick_Arp Guru
    Currently Being Moderated

    You can't generalize and give an answer. Each case need to be addressed with its own specifications.

     

    For example see this example

     

    Consider this table where you have a column with evenly distributed values of Y and N like this.

     

    SQL> drop table t;

    Table dropped.

     

    SQL> create table t
      2  as
      3  select t.*
      4       , case when rownum <= count(*) over()/2 then 'Y' else 'N' end as flag
      5    from all_objects t;

    Table created.

     

    SQL> create index t_idx on t(flag);

    Index created.

     

    SQL> exec dbms_stats.gather_table_stats(user, 'T', cascade=>true)

    PL/SQL procedure successfully completed.

     

    Now see the execution plan and cost for a index scan vs FTS. Index definitely does good.

     

    SQL> select count(*)
      2    from t
      3   where flag = 'Y';

      COUNT(*)
    ----------
        106680

     

    SQL> select * from table(dbms_xplan.display_cursor);

     

    PLAN_TABLE_OUTPUT
    -----------------------------------------------------------------------------------
    SQL_ID  43r5crkmtdb1q, child number 0
    -------------------------------------
    select count(*)   from t  where flag = 'Y'

    Plan hash value: 3482591947

    ---------------------------------------------------------------------------
    | Id  | Operation         | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
    ---------------------------------------------------------------------------
    |   0 | SELECT STATEMENT  |       |       |       |    60 (100)|          |
    |   1 |  SORT AGGREGATE   |       |     1 |     2 |            |          |
    |*  2 |   INDEX RANGE SCAN| T_IDX |   106K|   208K|    60   (4)| 00:00:01 |
    ---------------------------------------------------------------------------

    Predicate Information (identified by operation id):
    ---------------------------------------------------

       2 - access("FLAG"='Y')


    19 rows selected.

     

    SQL> select /*+FULL(t)*/ count(*)
      2    from t
      3   where flag = 'Y';

     

      COUNT(*)
    ----------
        106680

     

    SQL> select * from table(dbms_xplan.display_cursor);

     

    PLAN_TABLE_OUTPUT
    -----------------------------------------------------------------------------------
    SQL_ID  gmaf75wmfbqb4, child number 0
    -------------------------------------
    select /*+FULL(t)*/ count(*)   from t  where flag = 'Y'

    Plan hash value: 1842905362

    ---------------------------------------------------------------------------
    | Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
    ---------------------------------------------------------------------------
    |   0 | SELECT STATEMENT   |      |       |       |   835 (100)|          |
    |   1 |  SORT AGGREGATE    |      |     1 |     2 |            |          |
    |*  2 |   TABLE ACCESS FULL| T    |   106K|   208K|   835   (5)| 00:00:11 |
    ---------------------------------------------------------------------------

    Predicate Information (identified by operation id):
    ---------------------------------------------------

       2 - filter("FLAG"='Y')


    19 rows selected.

     

    SQL>

  • 6. Re: Index on column with only 2 distinct values
    myOra_help Journeyer
    Currently Being Moderated

    Hi,

    If your Table is not updated frequently then you can consider to create BITMAP index. BITMAP index is always advisable for systems in which data is not frequently updated by many concurrent systems and generally appropriate for columns having low cardinality (low distinct values) such as GENDER, MARITAL_STATUS, and RELATION.

    The size of the bitmap index depends on the cardinality of the column on which it is created as well as the data distribution. Consequently, a bitmap index on the GENDER column will be smaller than a B-tree index on the same column. In contrast, a bitmap index on EMPNO (a candidate for primary key) will be much larger than a B-tree index on this column.

     

    http://www.oracle.com/technetwork/articles/sharma-indexes-093638.html

    http://docs.oracle.com/cd/B28359_01/server.111/b28313/indexes.htm

  • 7. Re: Index on column with only 2 distinct values
    padders Pro
    Currently Being Moderated

    Particularly useful if you want to know how many 'Y's there were in the table a few seconds ago.

  • 8. Re: Index on column with only 2 distinct values
    Karthick_Arp Guru
    Currently Being Moderated

    padders wrote:

     

    Particularly useful if you want to know how many 'Y's there were in the table a few seconds ago.

     

    What?

  • 9. Re: Index on column with only 2 distinct values
    Lalit Kumar B Explorer
    Currently Being Moderated

    DheerajSingh wrote:

     

    Could i create an Index on column with only 2 distinct values('Y' and 'N').

     

    If I take your question literally, the answer is YES. You have all the right to create an index. The actual question is why do you want to do that?

     

    It's not necessary that if you create an index then it will be used by the CBO. Anyway, you can give it a try and check the clustering factor of the index. There are hundreds of reasons for saying a YES or NO to your question.

  • 10. Re: Index on column with only 2 distinct values
    padders Pro
    Currently Being Moderated

    > select count(*) from t where flag = 'Y'

     

    I was implying that your test query was not representative of something you would actually do and hence does not seem to be a very useful test.

  • 11. Re: Index on column with only 2 distinct values
    Etbin Guru
    Currently Being Moderated

    You can check what might happen => https://forums.oracle.com/message/11118414#11118414

     

    Regards

     

    Etbin

  • 12. Re: Index on column with only 2 distinct values
    bencol Pro
    Currently Being Moderated

    A single bitmap index on a gender column may not be much more useful than a b-tree index. The power of bitmap indexes is in their ability to be combined, and not in the fact that they are stored efficiently for columns with low cardinality. They are useful when you cannot easily predict the filters that will be used to access the data, this is why they can be useful in reporting databases.

     

    For there to be a benefit for a single index, then there must be significant skew towards one value. In this case either a histogram or a function based index, DECODEing the common value to null could be appropriate. If there is not significant skew in the values, and you need to access other data in the table, and there are no other filters, then a full table scan will be most efficient

  • 13. Re: Index on column with only 2 distinct values
    Karthick_Arp Guru
    Currently Being Moderated

    padders wrote:

     

    > select count(*) from t where flag = 'Y'

     

    I was implying that your test query was not representative of something you would actually do and hence does not seem to be a very useful test.

     

    You don't do COUNT? We do it all the time around here.

  • 14. Re: Index on column with only 2 distinct values
    padders Pro
    Currently Being Moderated

    What do you use it for? Perhaps you should consider whether you really need to.


    "Counting records is not smart, inefficient and should always be avoided", Tom Kyte, November 2009

1 2 Previous Next

Legend

  • Correct Answers - 10 points
  • Helpful Answers - 5 points