SQL & PL/SQL

Announcement

For appeals, questions and feedback about Oracle Forums, please email oracle-forums-moderators_us@oracle.com. Technical questions should be asked in the appropriate category. Thank you!

What index to use for "where col is IS NOT NULL" query

PuraVidaOTNOct 25 2013 — edited Oct 25 2013

If I need a query of the form:

select * from INV_TABLE where R_TIMEOUT IS NOT NULL

What kind of index do you suggest on the R_TIMEOUT column?

Note that I am only interested in whether it is being null or not.

Thanks.

This post has been answered by John Spencer on Oct 25 2013

Jump to Answer

John Spencer

Answer

It would really depend on how many of the rows were null compared to then number of not null rows. A single column index on r_timeout would only index the not null values, so, if most of the rows have a null value for r_timeout then that might be fairly efficient as an access path. However, if most of the rows are not null, then it is likely that a full table scan would be most efficient, particularly if you are selecting all of the columns from the table.

John

Marked as Answer by PuraVidaOTN · Sep 27 2020

Karthick2003

Why do you want to index that column. That is the question you need to be asking.

One basic thing is oracle does not index null values. Check this out

SQL> create table t
  2  as
  3  select level no, 'xxx' name
  4    from dual
  5  connect by level <= 10000;

Table created.

SQL> insert into t
  2  select null, 'xxx'
  3    from dual
  4  connect by level <= 100;

100 rows created.

SQL> commit;

Commit complete.

SQL> create index t_idx on t(no);

Index created.

SQL> exec dbms_stats.gather_table_stats(user, 'T', cascade=>true)

PL/SQL procedure successfully completed.

SQL> select count(*) from t;

  COUNT(*)
----------
     10100

SQL> select * from table(dbms_xplan.display_cursor);

PLAN_TABLE_OUTPUT
-------------------------------------------------------------------------
SQL_ID  cyzznbykb509s, child number 0
-------------------------------------
select count(*) from t
Plan hash value: 1842905362
-------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Cost (%CPU)| Time     |
-------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |       |     8 (100)|          |
|   1 |  SORT AGGREGATE    |      |     1 |            |          |
|   2 |   TABLE ACCESS FULL| T    | 10100 |     8   (0)| 00:00:01 |
-------------------------------------------------------------------

14 rows selected.

SQL> delete from t where no is null;

100 rows deleted.

SQL> commit;

Commit complete.

SQL> alter table t modify no not null;

Table altered.

SQL> exec dbms_stats.gather_table_stats(user, 'T', cascade=>true)

PL/SQL procedure successfully completed.

SQL> select count(*) from t;

  COUNT(*)
----------
     10000

SQL> select * from table(dbms_xplan.display_cursor);

PLAN_TABLE_OUTPUT
-------------------------------------------------------------------------
SQL_ID  cyzznbykb509s, child number 0
-------------------------------------
select count(*) from t
Plan hash value: 2361957729
------------------------------------------------------------------
| Id  | Operation        | Name  | Rows  | Cost (%CPU)| Time     |
------------------------------------------------------------------
|   0 | SELECT STATEMENT |       |       |     7 (100)|          |
|   1 |  SORT AGGREGATE  |       |     1 |            |          |
|   2 |   INDEX FULL SCAN| T_IDX | 10000 |     7   (0)| 00:00:01 |
------------------------------------------------------------------

14 rows selected.

SQL>

1 person found this helpful

BluShadow

Agree with John and Karthick.

However if the selectivity of the data is suitable, you may consider a bitmap index, which does support null values.

Read up on bitmap indexes first though to understand the benefits and pitfalls of using such.

PuraVidaOTN

Hello -

In your code you did not put in the where clause

which shows that the query uses full table scan, but if you put in, then it does uses index

in both cases. But using index may not result better performance if not selective.

SQL> drop table t;

Table dropped.

SQL>

SQL> create table t

2 as

3 select level no, 'xxx' name

4 from dual

5 connect by level <= 10000;

Table created.

SQL>

SQL> insert into t

2 select null, 'xxx'

3 from dual

4 connect by level <= 100;

100 rows created.

SQL>

SQL> commit;

Commit complete.

SQL>

SQL> create index t_idx on t(no);

Index created.

SQL>

SQL> exec dbms_stats.gather_table_stats(user, 'T', cascade=>true)

PL/SQL procedure successfully completed.

SQL>

SQL> select count(*) from t where no is not null;

COUNT(*)

----------

10000

SQL>

SQL> select * from table(dbms_xplan.display_cursor);

PLAN_TABLE_OUTPUT

--------------------------------------------------------------------------------

SQL_ID f6gf49ktv5y8s, child number 0

-------------------------------------

select count(*) from t where no is not null

Plan hash value: 1058879072

-------------------------------------------------------------------------------

-------------------------------------------------------------------------------

| 0 | SELECT STATEMENT | | | | 7 (100)| |

| 1 | SORT AGGREGATE | | 1 | 4 | | |

PLAN_TABLE_OUTPUT

--------------------------------------------------------------------------------

|* 2 | INDEX FAST FULL SCAN| T_IDX | 10000 | 40000 | 7 (0)| 00:00:01 |

-------------------------------------------------------------------------------

Predicate Information (identified by operation id):

---------------------------------------------------

2 - filter("NO" IS NOT NULL)

19 rows selected.

SQL> drop table t;

Table dropped.

SQL>

SQL> create table t

2 as

3 select level no, 'xxx' name

4 from dual

5 connect by level <= 10;

Table created.

SQL>

SQL> insert into t

2 select null, 'xxx'

3 from dual

4 connect by level <= 10000;

10000 rows created.

SQL>

SQL> commit;

Commit complete.

SQL>

SQL> create index t_idx on t(no);

Index created.

SQL>

SQL> exec dbms_stats.gather_table_stats(user, 'T', cascade=>true)

PL/SQL procedure successfully completed.

SQL>

SQL> select count(*) from t where no is not null;

COUNT(*)

----------

SQL>

SQL> select * from table(dbms_xplan.display_cursor);

PLAN_TABLE_OUTPUT

--------------------------------------------------------------------------------

SQL_ID f6gf49ktv5y8s, child number 0

-------------------------------------

select count(*) from t where no is not null

Plan hash value: 995313729

--------------------------------------------------------------------------

--------------------------------------------------------------------------

| 0 | SELECT STATEMENT | | | | 1 (100)| |

| 1 | SORT AGGREGATE | | 1 | 2 | | |

PLAN_TABLE_OUTPUT

--------------------------------------------------------------------------------

|* 2 | INDEX FULL SCAN| T_IDX | 10 | 20 | 1 (0)| 00:00:01 |

--------------------------------------------------------------------------

Predicate Information (identified by operation id):

---------------------------------------------------

2 - filter("NO" IS NOT NULL)

19 rows selected.

1 person found this helpful

PuraVidaOTN

I agree with John, but not with Karthick who said:

Why do you want to index that column. That is the question you need to be asking.

One basic thing is oracle does not index null values.

That Oracle does not index null values do not mean you cannot find what is not null from the index

which is what I asked in original post. Specifically, if I have 10100 rows, 100 rows is not null and rest

are nulls. The index has only 100 rows. Then this query "select * from my_table where the_column is not null"

will only needs to go to the index to find the 100 rows, and this is demonstrated with an example in

the previous post. Is this not true? Thanks.

1 - 5

Locked Post

New comments cannot be posted to this locked post.

Locked on Nov 22 2013

Added on Oct 25 2013

#index, #not

5 comments

4,208 views

SQL & PL/SQL

What index to use for "where col is IS NOT NULL" query

Comments

Post Details