This discussion is archived
10 Replies Latest reply: Dec 4, 2013 9:16 AM by NightWing RSS

Reg: Index on column containing NULLs

ranit B Expert
Currently Being Moderated

Hi Experts,

 

I have a doubt regarding creating indexes on columns containing NULLs (mostly) and other values.

 

Query:

SELECT

     a.col_1,

     a.col_2,

     a.col_3

FROM table_a a, table_b b

WHERE

     a.col_id = b.col_id AND

     b.col_1 = 'ABC' AND

     b.col_2 IN (1, 2, 3, 4) AND

     b.col_3 IS NOT NULL;

 

What kind of index will suit this scenario best?

 

I'm aware that B-tree indexes does not hold NULL values, wheres Bitmap does.

 

Also, read this but confused -- http://www.dba-oracle.com/oracle_tips_null_idx.htm

 

Any pointers regarding this. Help much appreciated.

 

-- Ranit

(on Oracle 11.2.0.3)

  • 1. Re: Reg: Index on column containing NULLs
    Suri Pro
    Currently Being Moderated

    Hi Ranit,

     

     

     

    Also, read this but confused -- http://www.dba-oracle.com/oracle_tips_null_idx.htm

     

    Any pointers regarding this. Help much appreciated.

     

    -- Ranit

    (on Oracle 11.2.0.3)

     

    Tom and other experts mentioned not to follow this site

     

    https://forums.oracle.com/message/11276431#11276431

     

     

    Cheers,

    Suri ;-)

  • 2. Re: Reg: Index on column containing NULLs
    ranit B Expert
    Currently Being Moderated

     

    Tom and other experts mentioned not to follow this site

     

    https://forums.oracle.com/message/11276431#11276431

     

     

    Yes Suri, I know that... I read that discussion long back.

     

    But, the information looked interesting... so read (though might be wrong, no comments on that... I'm just a kiddo before BC).

  • 3. Re: Reg: Index on column containing NULLs
    bencol Pro
    Currently Being Moderated

    Ranit,

     

    In your case you don't need to index the null values as you are only interested in the not null values. If col_3 is mostly null then a b-tree index on it might be benefical, see this simple example:

    sql> create table bc_test
      2  as select rownum col1
      3           ,case when val < 10 then
      4              trunc(val)
      5            else
      6              null
      7            end col2
      8     from   (select dbms_random.value(1,100000) val
      9             from   dual
    10             connect by level <= 100000
    11            );

    Table created.

    sql>

    sql> select col2,count(*) from bc_test group by col2 order by 1;

          COL2   COUNT(*)
    __________ __________
             1          4
             3          2
             4          2
             5          1
             6          1
             8          1
             9          1
                    99988

    sql>
    sql> create index bc_ix01 on bc_test (col2);

    Index created.

    sql>
    sql> exec dbms_stats.gather_table_stats(user,'BC_TEST',cascade=>true)

    PL/SQL procedure successfully completed.

    sql>
    sql> explain plan for select * from bc_test where col2 is not null;

    Explained.

    sql>
    sql> select plan_table_output from table(dbms_xplan.display('plan_table',null,'serial'));

    PLAN_TABLE_OUTPUT
    ____________________________________________________________________________________________
    Plan hash value: 3960122628

    ---------------------------------------------------------------------------------------
    | Id  | Operation                   | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
    ---------------------------------------------------------------------------------------
    |   0 | SELECT STATEMENT            |         |    12 |    72 |    13   (0)| 00:00:01 |
    |   1 |  TABLE ACCESS BY INDEX ROWID| BC_TEST |    12 |    72 |    13   (0)| 00:00:01 |
    |*  2 |  INDEX FULL SCAN           | BC_IX01 |    12 |       |     1   (0)| 00:00:01 |
    ---------------------------------------------------------------------------------------

    Predicate Information (identified by operation id):
    ---------------------------------------------------

       2 - filter("COL2" IS NOT NULL)

    14 rows selected.

     

    If you have a table with a column where null is rare and this is a filter in your queries, then you might want to include a value in the column expression to include the nulls in the index. (as in the great Don's example) e.g.

    sql> drop table bc_test;

    Table dropped.

    sql>
    sql> create table bc_test
      2  as select rownum col1
      3           ,case when val > 10 then
      4              trunc(val)
      5            else
      6              null
      7            end col2
      8     from   (select dbms_random.value(1,100000) val
      9             from   dual
    10             connect by level <= 100000
    11            );

    Table created.

    sql>
    sql> create index bc_ix01 on bc_test (col2,1);

    Index created.

    sql>
    sql> exec dbms_stats.gather_table_stats(user,'BC_TEST',cascade=>true)

    PL/SQL procedure successfully completed.

    sql>
    sql> explain plan for select * from bc_test where col2 is null;

    Explained.

    sql>
    sql> select plan_table_output from table(dbms_xplan.display('plan_table',null,'serial'));

    PLAN_TABLE_OUTPUT
    ______________________________________________________________________________________________
    Plan hash value: 2307915924

    ---------------------------------------------------------------------------------------
    | Id  | Operation                   | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
    ---------------------------------------------------------------------------------------
    |   0 | SELECT STATEMENT            |         |     7 |    70 |     9   (0)| 00:00:01 |
    |   1 |  TABLE ACCESS BY INDEX ROWID| BC_TEST |     7 |    70 |     9   (0)| 00:00:01 |
    |*  2 |   INDEX RANGE SCAN          | BC_IX01 |     7 |       |     2   (0)| 00:00:01 |
    ---------------------------------------------------------------------------------------

    Predicate Information (identified by operation id):
    ---------------------------------------------------

       2 - access("COL2" IS NULL)

     

    Ben

  • 4. Re: Reg: Index on column containing NULLs
    ranit B Expert
    Currently Being Moderated

    Thanks for the nice demo, Ben.

    Your time and effort spent is really appreciated.

     

    But,

    If col_3 is mostly null then a b-tree index on it might be benefical,

    AFAIK, b-tree index can't store NULLs... and Bitmap index can work in those cases.

     

    Also, I'm not very sure if your demonstration is similar to my query because in your case it is 'SELECT *...' whereas in my case it is just used as a kindof lookup. In such case, it should go for FFS... right?

     

    Please rectify me if I'm getting things wrongly.

     

    -- Ranit

  • 5. Re: Reg: Index on column containing NULLs
    bencol Pro
    Currently Being Moderated

    But you don't care about the nulls

    b.col_3 IS NOT NULL;

     

    And their presence (or not) in the index doesn't change the cardinality of the index lookup. So if the not nulls are in the index then the index scan is an option. For your filter (not null) the optimizer will know that this is quite selective as there are not many entries in the index for this value. The lack of nulls in the index doesn't matter, the number of blocks that the optimizer will estimate reading will be less for the index lookup. That is what I was trying to show.

     

    Ben

  • 6. Re: Reg: Index on column containing NULLs
    NightWing Journeyer
    Currently Being Moderated

    bencol wrote:

     

    Ranit,

     

    In your case you don't need to index the null values as you are only interested in the not null values. If col_3 is mostly null then a b-tree index on it might be benefical, see this simple example:

    sql> create table bc_test
      2  as select rownum col1
      3           ,case when val < 10 then
      4              trunc(val)
      5            else
      6              null
      7            end col2
      8     from   (select dbms_random.value(1,100000) val
      9             from   dual
    10             connect by level <= 100000
    11            );

    Table created.

    sql>

    sql> select col2,count(*) from bc_test group by col2 order by 1;

          COL2   COUNT(*)
    __________ __________
             1          4
             3          2
             4          2
             5          1
             6          1
             8          1
             9          1
                    99988

    sql>
    sql> create index bc_ix01 on bc_test (col2);

    Index created.

    sql>
    sql> exec dbms_stats.gather_table_stats(user,'BC_TEST',cascade=>true)

    PL/SQL procedure successfully completed.

    sql>
    sql> explain plan for select * from bc_test where col2 is not null;

    Explained.

    sql>
    sql> select plan_table_output from table(dbms_xplan.display('plan_table',null,'serial'));

    PLAN_TABLE_OUTPUT
    ____________________________________________________________________________________________
    Plan hash value: 3960122628

    ---------------------------------------------------------------------------------------
    | Id  | Operation                   | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
    ---------------------------------------------------------------------------------------
    |   0 | SELECT STATEMENT            |         |    12 |    72 |    13   (0)| 00:00:01 |
    |   1 |  TABLE ACCESS BY INDEX ROWID| BC_TEST |    12 |    72 |    13   (0)| 00:00:01 |
    |*  2 |  INDEX FULL SCAN           | BC_IX01 |    12 |       |     1   (0)| 00:00:01 |
    ---------------------------------------------------------------------------------------

    Predicate Information (identified by operation id):
    ---------------------------------------------------

       2 - filter("COL2" IS NOT NULL)

    14 rows selected.

     

     

     

     

     

    Hello Ben,

     

    One thing that I don't understand is in your first explain plan, as you marked in red, how come INDEX FULL SCAN is used and FILTER predicate performed? I thought when any index is used,  ACCESS predicate performing. Could you please explain this?

     

    Thanks in advanced.

  • 7. Re: Reg: Index on column containing NULLs
    bencol Pro
    Currently Being Moderated

    NightWing,

     

    The not null is telling the optmizer that it need to look at every row in the index, so a full scan is required. I think the filter predicate is used because of the full scan. It would be the access predicate for a range scan - start stop keys are required to define the range. If I define the range as "where col2 between 1 and 9" the access predicate is used

     

    Ben

  • 8. Re: Reg: Index on column containing NULLs
    NightWing Journeyer
    Currently Being Moderated

    bencol wrote:

     

    NightWing,

     

    The not null is telling the optmizer that it need to look at every row in the index, so a full scan is required. I think the filter predicate is used because of the full scan. It would be the access predicate for a range scan - start stop keys are required to define the range. If I define the range as where col2 between 1 and 9 the access predicate is used

     

    Ben

     

    But, if you look at the explain plain, it says INDEX FULL SCAN

     

    |*  2 |  INDEX FULL SCAN           | BC_IX01 |    12 |       |     1   (0)| 00:00:01 |

     

     

    I don't understand

  • 9. Re: Reg: Index on column containing NULLs
    bencol Pro
    Currently Being Moderated

    I have not supplied start and stop keys, so the whole index needs to be scanned to meet the predicate - col2 is not null - this implies every row in the index, as the nulls are not in the index. Once oracle has every rowid from the index it can retrieve the data from the table (12 out of 100k rows)

     

    How do you think the index should be accessed? What range am I supplying for a range scan?

  • 10. Re: Reg: Index on column containing NULLs
    NightWing Journeyer
    Currently Being Moderated

    bencol wrote:

     

    I have not supplied start and stop keys, so the whole index needs to be scanned to meet the predicate - col2 is not null - this implies every row in the index, as the nulls are not in the index. Once oracle has every rowid from the index it can retrieve the data from the table (12 out of 100k rows)

     

    How do you think the index should be accessed? What range am I supplying for a range scan?

     

    Before read, first of all please don't angry with me, I am just trying to understand. I am not professional like you. So, thanks for your patient.

     

    sql> explain plan for select * from bc_test where col2 is not null;


    As you can say, the query wants every row in the index(index full scan) because nulls are not stored in the index. So, get all rowsids in the index (12 rows) then access table with these rowids. So,I still don't understand why 2 - filter("COL2" IS NOT NULL) is used.


    Can you please explain?


    Thanks in advance.

Legend

  • Correct Answers - 10 points
  • Helpful Answers - 5 points