Forum Stats

• 3,728,702 Users
• 2,245,676 Discussions

Discussions

Howdy, Stranger!

It looks like you're new here. If you want to get involved, click one of these buttons!

peformance:nvl( my_column, 'N') <> 'Y' VS nvl( my_column, 'N') = 'N'

Member Posts: 26
edited March 2014

Which one gives better performance? Could please explain.

1)nvl( my_column, 'N') <> 'Y'

2) nvl( my_column, 'N') = 'N'

--

Thanks

Peddi

• Member, Moderator Posts: 40,327 Red Diamond
edited March 2014 Accepted Answer

Hi, Peddi,

The comparison will be equally efficient, whether you use = or <>.

• Member Posts: 18,288 Black Diamond

Neither. You have expression against my_column in both cases. Therefore index (assuming my_column is indexed) can't be used. However, if you create FBI on nvl( my_column, 'N'), then nvl( my_column, 'N') = 'N' will perform better assuming  my_column can have values other that N, Y or NULL. Otherwise that performance difference will be very close to 0.

SY.

• Member, Moderator Posts: 40,327 Red Diamond

Hi,

```Peddi(user576892) wrote:

Which one gives better performance? Could please explain.
1)nvl( my_column, 'N') <> 'Y'
2) nvl( my_column, 'N') = 'N'
--
Thanks
Peddi
```

Performance will be the same.

The results will be different if my_column is anything except 'Y', 'N' or NULL.

• Member Posts: 972
edited March 2014

Well, no reason to believe it would be any different, so let's do a quick test case

(and yes, results support what Solomon and Frank already pointed out ... ie no difference)

Setup a table to test - with some fake data.

```create table junk (
id  number,
dd  date,
cflag  char(1)
)
/
insert into junk
select level id,
sysdate - floor(dbms_random.value(1,1000)) dd,
case mod(floor(dbms_random.value(1,100)),2) when 0 then 'Y' else 'N' end cflag
from dual
connect by level <= 1000000
/
commit;

```

start trace ..

Query 1

```set autotrace traceonly

select * from junk
where nvl(cflag,'N') <> 'Y'
/

```

Results: Query 1

```505298 rows selected.

Elapsed: 00:00:49.77

Execution Plan
----------------------------------------------------------
Plan hash value: 3017102790

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |   563K|    13M|   768   (3)| 00:00:10 |
|*  1 |  TABLE ACCESS FULL| JUNK |   563K|    13M|   768   (3)| 00:00:10 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - filter(NVL("CFLAG",'N')<>'Y')

Note
-----
- dynamic sampling used for this statement (level=2)

Statistics
----------------------------------------------------------
29  recursive calls
1  db block gets
36492  consistent gets
176  redo size
10895874  bytes sent via SQL*Net to client
236165  bytes received via SQL*Net from client
33688  SQL*Net roundtrips to/from client
0  sorts (memory)
0  sorts (disk)
505298  rows processed

```

Query 2

```select * from junk
where nvl(cflag,'N') = 'N'
/

```

Results: Query 1

```505298 rows selected.

Elapsed: 00:00:49.76

Execution Plan
----------------------------------------------------------
Plan hash value: 3017102790

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |   563K|    13M|   768   (3)| 00:00:10 |
|*  1 |  TABLE ACCESS FULL| JUNK |   563K|    13M|   768   (3)| 00:00:10 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - filter(NVL("CFLAG",'N')='N')

Note
-----
- dynamic sampling used for this statement (level=2)

Statistics
----------------------------------------------------------
5  recursive calls
0  db block gets
36447  consistent gets
0  redo size
10895838  bytes sent via SQL*Net to client
236165  bytes received via SQL*Net from client
33688  SQL*Net roundtrips to/from client
0  sorts (memory)
0  sorts (disk)
505298  rows processed

```

Now at this point, the 2nd query looks a bit better (ie redo size), however, that's often because Oracle still has things cached.

So we can either clear the cache, or just re-run the first and see how it compares.

(I'm feeling lazy, so I'll re-run the first query again)

Query 1

```select * from junk
where nvl(cflag,'N') <> 'Y'
/

```

Results: Query 1 (take 2)

```505298 rows selected.

Elapsed: 00:00:49.78

Execution Plan
----------------------------------------------------------
Plan hash value: 3017102790

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |   563K|    13M|   768   (3)| 00:00:10 |
|*  1 |  TABLE ACCESS FULL| JUNK |   563K|    13M|   768   (3)| 00:00:10 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - filter(NVL("CFLAG",'N')<>'Y')

Note
-----
- dynamic sampling used for this statement (level=2)

Statistics
----------------------------------------------------------
0  recursive calls
0  db block gets
36370  consistent gets
0  redo size
10896095  bytes sent via SQL*Net to client
236165  bytes received via SQL*Net from client
33688  SQL*Net roundtrips to/from client
0  sorts (memory)
0  sorts (disk)
505298  rows processed

```

And suddenly they look very very similar.

As already mentioned, I can't see any significant difference in performance with that.

Unless you have some other additional criteria or conditions, or something ...

• Member Posts: 26

Thanks for your responses.

Possible Values of my_columns are only 3 ( i ) 'Y'  ( ii )  'N' ( iii ) NULL

1) nvl( my_column, 'N') <> 'Y'

2) nvl( my_column, 'N') = 'N'

Both gives same performance?

If not, could you please explain the reason for the better performance option.

In other words, does the = operator give better performance than <> operator with strings?

--

Thanks

Peddi

• Member, Moderator Posts: 40,327 Red Diamond
edited March 2014 Accepted Answer

Hi, Peddi,

The comparison will be equally efficient, whether you use = or <>.

• Member Posts: 26

Many thanks for the practical results.

• Member Posts: 9,587 Gold Crown

The answer depends on how much of the data is null, N or Y.

If you have a large data set and a tiny amount of data at N or NULL then you can create a function-based index on nvl(mycolumn,'N'), create a histogram on the indexed column, and the predicate nvl(mycolumn,'N')  = 'N' will be able to use the index.

Better yet in those circumstances to use a CASE expression to convert everything else to NULL, (and change the predicate accordingly), then you have an index only on the rows you want and don't need a histogram.

Regards

Jonathan Lewis

• Member Posts: 16,076 Gold Crown

How about creating the index as follows: create index t_idx on t (my_column, 'N'); ?

Then would we still need an NVL in the query and/or the function based index?

(I'm without a database to test this currently, hence my question.)

This discussion has been closed.