Forum Stats

  • 3,728,707 Users
  • 2,245,678 Discussions
  • 7,853,707 Comments

Discussions

peformance:nvl( my_column, 'N') <> 'Y' VS nvl( my_column, 'N') = 'N'

Peddi-User576892-Oracle
Peddi-User576892-Oracle Member Posts: 26
edited March 2014 in SQL & PL/SQL

Which one gives better performance? Could please explain.

1)nvl( my_column, 'N') <> 'Y'

2) nvl( my_column, 'N') = 'N'

--

Thanks

Peddi

Best Answer

Answers

  • Solomon Yakobson
    Solomon Yakobson Member Posts: 18,288 Black Diamond

    Neither. You have expression against my_column in both cases. Therefore index (assuming my_column is indexed) can't be used. However, if you create FBI on nvl( my_column, 'N'), then nvl( my_column, 'N') = 'N' will perform better assuming  my_column can have values other that N, Y or NULL. Otherwise that performance difference will be very close to 0.

    SY.

  • Frank Kulash
    Frank Kulash Member, Moderator Posts: 40,327 Red Diamond

    Hi,

    Peddi(user576892) wrote:
    
    Which one gives better performance? Could please explain.
    1)nvl( my_column, 'N') <> 'Y' 
    2) nvl( my_column, 'N') = 'N' 
    --
    Thanks
    Peddi
    

    Performance will be the same.

    The results will be different if my_column is anything except 'Y', 'N' or NULL.

  • Greg Spall
    Greg Spall Member Posts: 972
    edited March 2014

    Well, no reason to believe it would be any different, so let's do a quick test case

    (and yes, results support what Solomon and Frank already pointed out ... ie no difference)

    Setup a table to test - with some fake data.

    create table junk (
       id  number,
       dd  date,
       cflag  char(1)
       )
    /
    
    insert into junk select level id,        sysdate - floor(dbms_random.value(1,1000)) dd,        case mod(floor(dbms_random.value(1,100)),2) when 0 then 'Y' else 'N' end cflag    from dual   connect by level <= 1000000 /
    commit;

    start trace ..

    Query 1

    set autotrace traceonly
    
    select * from junk
    where nvl(cflag,'N') <> 'Y'
    /
    
    

    Results: Query 1

    505298 rows selected.
    
    Elapsed: 00:00:49.77
    
    Execution Plan
    ----------------------------------------------------------
    Plan hash value: 3017102790
    
    --------------------------------------------------------------------------
    | Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
    --------------------------------------------------------------------------
    |   0 | SELECT STATEMENT  |      |   563K|    13M|   768   (3)| 00:00:10 |
    |*  1 |  TABLE ACCESS FULL| JUNK |   563K|    13M|   768   (3)| 00:00:10 |
    --------------------------------------------------------------------------
    
    Predicate Information (identified by operation id):
    ---------------------------------------------------
    
       1 - filter(NVL("CFLAG",'N')<>'Y')
    
    Note
    -----
       - dynamic sampling used for this statement (level=2)
    
    Statistics
    ----------------------------------------------------------
             29  recursive calls
              1  db block gets
          36492  consistent gets
              0  physical reads
            176  redo size
       10895874  bytes sent via SQL*Net to client
         236165  bytes received via SQL*Net from client
          33688  SQL*Net roundtrips to/from client
              0  sorts (memory)
              0  sorts (disk)
         505298  rows processed
    
    

    Query 2

    select * from junk
    where nvl(cflag,'N') = 'N'
    /
    
    

    Results: Query 1

    505298 rows selected.
    
    Elapsed: 00:00:49.76
    
    Execution Plan
    ----------------------------------------------------------
    Plan hash value: 3017102790
    
    --------------------------------------------------------------------------
    | Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
    --------------------------------------------------------------------------
    |   0 | SELECT STATEMENT  |      |   563K|    13M|   768   (3)| 00:00:10 |
    |*  1 |  TABLE ACCESS FULL| JUNK |   563K|    13M|   768   (3)| 00:00:10 |
    --------------------------------------------------------------------------
    
    Predicate Information (identified by operation id):
    ---------------------------------------------------
    
       1 - filter(NVL("CFLAG",'N')='N')
    
    Note
    -----
       - dynamic sampling used for this statement (level=2)
    
    Statistics
    ----------------------------------------------------------
              5  recursive calls
              0  db block gets
          36447  consistent gets
              0  physical reads
              0  redo size
       10895838  bytes sent via SQL*Net to client
         236165  bytes received via SQL*Net from client
          33688  SQL*Net roundtrips to/from client
              0  sorts (memory)
              0  sorts (disk)
         505298  rows processed
    
    

    Now at this point, the 2nd query looks a bit better (ie redo size), however, that's often because Oracle still has things cached.

    So we can either clear the cache, or just re-run the first and see how it compares.

    (I'm feeling lazy, so I'll re-run the first query again)

    Query 1

    select * from junk
    where nvl(cflag,'N') <> 'Y'
    /
    
    

    Results: Query 1 (take 2)

    505298 rows selected.
    
    Elapsed: 00:00:49.78
    
    Execution Plan
    ----------------------------------------------------------
    Plan hash value: 3017102790
    
    --------------------------------------------------------------------------
    | Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
    --------------------------------------------------------------------------
    |   0 | SELECT STATEMENT  |      |   563K|    13M|   768   (3)| 00:00:10 |
    |*  1 |  TABLE ACCESS FULL| JUNK |   563K|    13M|   768   (3)| 00:00:10 |
    --------------------------------------------------------------------------
    
    Predicate Information (identified by operation id):
    ---------------------------------------------------
    
       1 - filter(NVL("CFLAG",'N')<>'Y')
    
    Note
    -----
       - dynamic sampling used for this statement (level=2)
    
    Statistics
    ----------------------------------------------------------
              0  recursive calls
              0  db block gets
          36370  consistent gets
              0  physical reads
              0  redo size
       10896095  bytes sent via SQL*Net to client
         236165  bytes received via SQL*Net from client
          33688  SQL*Net roundtrips to/from client
              0  sorts (memory)
              0  sorts (disk)
         505298  rows processed
    
    

    And suddenly they look very very similar.

    As already mentioned, I can't see any significant difference in performance with that.

    Unless you have some other additional criteria or conditions, or something ...

  • Thanks for your responses.

    Possible Values of my_columns are only 3 ( i ) 'Y'  ( ii )  'N' ( iii ) NULL

    1) nvl( my_column, 'N') <> 'Y'

    2) nvl( my_column, 'N') = 'N'

    Both gives same performance?

    If not, could you please explain the reason for the better performance option.

    In other words, does the = operator give better performance than <> operator with strings?

    --

    Thanks

    Peddi

  • Frank Kulash
    Frank Kulash Member, Moderator Posts: 40,327 Red Diamond
    edited March 2014 Accepted Answer

    Hi, Peddi,

    The comparison will be equally efficient, whether you use = or <>.

  • Many thanks for the practical results.

  • Jonathan Lewis
    Jonathan Lewis Member Posts: 9,587 Gold Crown

    The answer depends on how much of the data is null, N or Y.

    If you have a large data set and a tiny amount of data at N or NULL then you can create a function-based index on nvl(mycolumn,'N'), create a histogram on the indexed column, and the predicate nvl(mycolumn,'N')  = 'N' will be able to use the index.

    Better yet in those circumstances to use a CASE expression to convert everything else to NULL, (and change the predicate accordingly), then you have an index only on the rows you want and don't need a histogram.

    Regards

    Jonathan Lewis

  • Hoek
    Hoek Member Posts: 16,076 Gold Crown

    How about creating the index as follows: create index t_idx on t (my_column, 'N'); ?

    Then would we still need an NVL in the query and/or the function based index?

    (I'm without a database to test this currently, hence my question.)

This discussion has been closed.