
SQL & PL/SQL


Function call overhead

657156 · Sep 25 2009 — edited Sep 25 2009
This is 11g on Linux, but reproducible on all platforms.

I think I am hitting a performance wall due to the overhead of calling a function for every row in a result set, and I am wondering if anyone has a neat way around this. The simplest example is a function that does nothing; then compare the times, i.e.:


#
# Create function that does nothing, we are really just measuring the function call overhead
#
SQL> create or replace function do_nothing(x varchar2) return varchar2 is begin return x; end do_nothing;
/

Function created.

#
# Build a bogus table of strings to test
#
SQL> create table test (val varchar2(100));
SQL> begin
for i in 1..1000000 loop
insert into test (val) values (dbms_random.string('A',10));
end loop;
commit;
end;
/


#
# now time, comparing time diffs from no function versus with function
#
SQL> select max(val) from test;

MAX(VAL)
==========
zzzxgZcdxk

Elapsed: 00:00:00.06
SQL> select max(do_nothing(val)) from test;

MAX(DO_NOTHING(VAL))
================
zzzxgZcdxk

Elapsed: 00:00:03.59




So, is there any way to do this faster? I have tried doing it with a Java stored proc and got the same exact time, so I suspect the time is just the overhead of making, in this case, the 1 million function calls.


And for extra credit: maybe I am doing something silly in the first place, but the reason we are doing this is to take Oracle timestamps and convert them into doubles representing "days from epoch", including sub-day precision with timezone conversion. Is there a way I can do this without a function call? Here is the real function I am using:

CREATE OR REPLACE FUNCTION DAYS_FROM_EPOCH (p_d1 timestamp, p_targetTZ varchar2 default 'GMT')
RETURN NUMBER is
  l_epoch timestamp;
  l_diff  interval day(9) to second(9);
BEGIN
  l_epoch := to_date('1/1/1970','mm/dd/yyyy');
  l_diff  := from_tz(p_d1,'GMT') - from_tz(l_epoch, p_targetTZ);
  return extract(day from l_diff)
       + extract(hour from l_diff)/24
       + extract(minute from l_diff)/24/60
       + extract(second from l_diff)/24/60/60;
END DAYS_FROM_EPOCH;
/



used like:

SQL> select days_from_epoch(systimestamp) from dual;

DAYS_FROM_EPOCH(SYSTIMESTAMP)
=======================
14512.4224
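

For reference, here is a rough sketch of the same arithmetic written inline in SQL, with no PL/SQL call at all. The table and TIMESTAMP column names (some_table, ts_col) are hypothetical placeholders, and the epoch is anchored in 'GMT' to match the function's default:

select extract(day    from diff)
     + extract(hour   from diff)/24
     + extract(minute from diff)/24/60
     + extract(second from diff)/24/60/60 as days_from_epoch
  from (select from_tz(ts_col, 'GMT')
               - from_tz(timestamp '1970-01-01 00:00:00', 'GMT') as diff
          from some_table);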



Thanks for your help!

Comments

fsitja
One thing you can do is mark the function result to be cached:
CREATE OR REPLACE FUNCTION DAYS_FROM_EPOCH (p_d1 timestamp,p_targetTZ varchar2 default 'GMT')
RETURN NUMBER RESULT_CACHE is -- <= *HERE*
...
END DAYS_FROM_EPOCH;
Then use a RESULT_CACHE hint in your SQL:
select /*+ RESULT_CACHE */ days_from_epoch(systimestamp) from dual;
http://download.oracle.com/docs/cd/E11882_01/server.112/e10821/memory.htm#PFGRF982

Edit: I remember seeing somewhere that the DETERMINISTIC keyword now also correctly optimizes performance in Oracle 11g.

http://download.oracle.com/docs/cd/E11882_01/appdev.112/e10471/adfns_packages.htm#ADFNS386

Edited by: fsitja on Sep 25, 2009 1:31 PM
657156
Thanks for the reply. That would help if the function itself made a query or was heavier-weight than our simple example, but performance actually gets much worse when applying that to the simple test I outlined. The problem is that I think evaluation of any of this stuff (be it DETERMINISTIC, RESULT_CACHE, etc.) is all managed by the function itself, so Oracle has to stop result set processing and do all this other work. Witness:

SQL> create or replace function do_nothing(x varchar2) return varchar2 result_cache relies_on (test) is begin return x; end do_nothing;
/


SQL> select max(do_nothing(val)) from test;

MAX(DO_NOTHING(VAL))
==================
zzzxgZcdxk

Elapsed: 00:02:48.39
SQL> select max(val) from test;

MAX(VAL)
===============
zzzxgZcdxk

Elapsed: 00:00:00.06



The DETERMINISTIC keyword does not make it any better, although at least we are back to the original 3.5-second time versus the truly awful 3-minute time with RESULT_CACHE.
fsitja
I don't have access to my DB at the moment and I can't test it myself, so I'm just throwing it out there, but have you tried using the hint?

It seems to me caching the result of the query itself would be better than caching the function return in this case, given you're running it several times and it's a light-weight processing function.
657156
Yeah, I tried it with the hint too:

SQL> select /*+ RESULT_CACHE */ max(do_nothing(val)) from test;

MAX(DO_NOTHING(VAL))
--------------------
zzzxgZcdxk

Elapsed: 00:02:47.78



The obvious solution here (at least to me) is to do this in application space, where it's a single function call that iterates over an array. Doing this in Java takes a couple of milliseconds, much faster. But there ought to be a way to make this fast on the db side!

Query caching will be of minimal help I think as the underlying data sees frequent updates.

Thanks for the ideas though
21205
Is there any particular reason why you want to convert a timestamp to epoch?
JustinCave
Depending on how exactly you are using the function, one potential option is to create a function-based index (you could accomplish the same thing with a virtual column) so that the conversion to an epoch is done when the row is inserted into the table rather than at query time. That adds index maintenance overhead to the INSERT but makes retrieving the data trivial.
SQL> ed
Wrote file afiedt.buf

  1  create or replace function do_nothing(x varchar2)
  2    return varchar2
  3    deterministic
  4  is
  5  begin
  6    return x;
  7* end do_nothing;
SQL> /

Function created.

Elapsed: 00:00:00.36
SQL> create index idx_fbi
  2  on test( do_nothing( val ) );

Index created.

Elapsed: 00:00:16.51
SQL> select max(do_nothing(val)) from test;

MAX(DO_NOTHING(VAL))
-------------------------------------------------------------------------

zzzukpZHUt

Elapsed: 00:00:00.06
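
A hypothetical sketch of the virtual-column variant mentioned above, assuming the conversion function is declared DETERMINISTIC (virtual column expressions can only reference deterministic PL/SQL functions) and that the table has a TIMESTAMP column; your_table and ts_col are placeholder names:

alter table your_table add (
  -- computed at query/DML time from the deterministic function, never stored
  epoch_days number generated always as (days_from_epoch(ts_col)) virtual
);

-- indexing the virtual column is equivalent to a function-based index
create index idx_epoch_days on your_table(epoch_days);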
Since the majority of the overhead is likely the result of SQL to PL/SQL context shifts, if you are writing PL/SQL, you should be able to use collections to reduce the number of context shifts that occur. Ideally, you could do a BULK COLLECT (with a LIMIT) into a collection and the function would take a collection of dates and return a collection of numbers.
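
A minimal sketch of that collection-based pattern, reusing DAYS_FROM_EPOCH from above (the table name your_table and TIMESTAMP column ts_col are placeholders):

declare
  type ts_tab  is table of timestamp;
  type num_tab is table of number;
  l_ts   ts_tab;
  l_days num_tab := num_tab();
  cursor c is select ts_col from your_table;
begin
  open c;
  loop
    -- fetch in batches; LIMIT keeps memory bounded
    fetch c bulk collect into l_ts limit 1000;
    exit when l_ts.count = 0;

    l_days.delete;
    l_days.extend(l_ts.count);
    -- the conversion now runs entirely inside PL/SQL,
    -- avoiding a SQL-to-PL/SQL context switch per row
    for i in 1 .. l_ts.count loop
      l_days(i) := days_from_epoch(l_ts(i));
    end loop;

    -- ... do something with l_days here ...
  end loop;
  close c;
end;
/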

Justin