This discussion is archived
10 Replies Latest reply: Apr 25, 2012 11:21 PM by Jonathan Lewis RSS

10046 trace confused

Kramer Newbie
Currently Being Moderated
=====================
PARSING IN CURSOR #2 len=657 dep=0 uid=358 oct=3 lid=358 tim=1182070262728 hv=2222393979 ad='de83b938'
SELECT v_user.pm_logon_id,v_user.home_department_c, v_user.ic_n, v_user.staff_n, v_user.roster_scheme_n, v_user.meal_pref_c, v_user.work_department_c,v_user.
meal_payment_i, v_user.end_dt, v_user.ic_n as staff_work_ic_n
FROM v_ccs_op_user v_user
WHERE (v_user.end_dt IS NULL OR (TRUNC(v_user.end_dt, 'dd')>=TRUNC(to_date('2012-04-05 00:00:00','YYYY-MM-DD HH24:MI:SS'), 'dd'))) AND TRUNC(v_user.start_dt,
'dd')<=TRUNC(to_date('2012-04-05 00:00:00','YYYY-MM-DD HH24:MI:SS'), 'dd')
AND (v_user.HOME_DEPARTMENT_C IN (SELECT d.department_c FROM department d WHERE d.DIVISION_C='C') OR v_user.home_department_c IN ('Q'))
ORDER BY v_user.end_dt, v_user.start_dt ASC
END OF STMT
PARSE #2:c=0,e=4764,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,tim=1182070262719
BINDS #2:
EXEC #2:c=0,e=255,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,tim=1182070263378
WAIT #2: nam='SQL*Net message to client' ela= 5 driver id=1650815232 #bytes=1 p3=0 obj#=258 tim=1182070263514
*** 2012-04-24 15:41:49.623

FETCH #2:c=306350000,e=299174653,p=0,cr=22298,cu=0,mis=0,r=1,dep=0,og=1,tim=1182369438322

WAIT #2: nam='SQL*Net message from client' ela= 1454 driver id=1650815232 #bytes=1 p3=0 obj#=258 tim=1182369440799
WAIT #2: nam='SQL*Net message to client' ela= 4 driver id=1650815232 #bytes=1 p3=0 obj#=258 tim=1182369441012
FETCH #2:c=0,e=293,p=0,cr=0,cu=0,mis=0,r=15,dep=0,og=1,tim=1182369441265
WAIT #2: nam='SQL*Net message from client' ela= 7437 driver id=1650815232 #bytes=1 p3=0 obj#=258 tim=1182369448858
WAIT #2: nam='SQL*Net message to client' ela= 3 driver id=1650815232 #bytes=1 p3=0 obj#=258 tim=1182369449007
FETCH #2:c=0,e=283,p=0,cr=0,cu=0,mis=0,r=15,dep=0,og=1,tim=1182369449257
WAIT #2: nam='SQL*Net message from client' ela= 6701 driver id=1650815232 #bytes=1 p3=0 obj#=258 tim=1182369456082
WAIT #2: nam='SQL*Net message to client' ela= 2 driver id=1650815232 #bytes=1 p3=0 obj#=258 tim=1182369456222


-----------

I was using 10046 to trace one SQL. I used command "tail -f" to monitor the raw trace file while running the SQL. During the monitoring, I found that most of the time was spend at this step FETCH #2:c=306350000,e=299174653,p=0,cr=22298,cu=0,mis=0,r=1,dep=0,og=1,tim=1182369438322. It is high lightened on the above output.
My question is
1. what does e=299174653 mean? is it meaning the elapsed time for the fetch step or the previous step which is EXEC ?

2. if the e=299174653 means elapsed time for the current fetch step, then why this fetch cost so much of time? what is the different between this first fetch and other fetches?

3. what is the exec and fetch means here? I thought the fetch is merely reading data from buffer or data file, and exec is some other actions. but seems my understanding is not correct.

4. is there any docs or something can help to understand the trace file?


thank you very much in advance :)

Edited by: Kramer on 2012-4-24 下午7:51
  • 1. Re: 10046 trace confused
    Mark Malakanov (user11181920) Expert
    Currently Being Moderated
    you better use tkprof to process, sort, compress and beautify trace file.
  • 2. Re: 10046 trace confused
    sb92075 Guru
    Currently Being Moderated
  • 3. Re: 10046 trace confused
    Tubby Guru
    Currently Being Moderated
    If you're interested in reading raw trace files you may want to delve in to some of Cary Millsap's material.

    http://method-r.com/papers
  • 4. Re: 10046 trace confused
    871018 Explorer
    Currently Being Moderated
    Hi Kramer,

    Oracle does almost nothing for queries on EXEC phase, only some minor steps as fix SCN for a query. It tries to maximally defer the major work.
    DML statements and "select for update" are done on EXEC phase.
    Your query contains ORDER BY, so Oracle scans your data and puts it into a workarea, sorts it, this is done during the first fetch. Another fetches will get sorted data from workarea.
    Thus, it is expected that the first fetch is long and heavy (and contains all logical reads of the query, cpu work, etc) and another fetches will are short and lite in your case.


    Alexander Anokhin
    http://alexanderanokhin.wordpress.com/
  • 5. Re: 10046 trace confused
    Paulie Pro
    Currently Being Moderated
    >
    4. is there any docs or something can help to understand the trace file?
    Firstly, always tell us your version of Oracle - SELECT * FROM V$Version;


    You are lucky in that there is a truck load of stuff - unlucky in that it's complex and will take time.
    Can you tell us what your goal is?

    The Oracle documentation - TKPROF http://docs.oracle.com/cd/B10500_01/server.920/a96533/sqltrace.htm
    Cary Millsap's "Optimizing Oracle Performance" - very detailled - considered the reference work.
    A good introduction is Christopher Lawson's "The Art and Science of Oracle Performance Tuning"
    You might also want to look at
    http://www.orafaq.com/wiki/TKProf
    http://www.orafaq.com/wiki/SQL_Trace
    http://oracledba.ru/orasrp/ - a tool like TKPROF (free)
    http://www.oracle-base.com/articles/8i/tkprof-and-oracle-trace.php
    http://www.dba-oracle.com/t_10046_trace_file_parse_execute.htm
    http://www.adp-gmbh.ch/ora/misc/trace_files.html

    Take a look - it's not for the fainthearted!

    Paul...
  • 6. Re: 10046 trace confused
    CharlesHooper Expert
    Currently Being Moderated
    Kramer,

    That certainly is an interesting 10046 trace. The trace file claims that the session's process was on the CPU for 306 seconds in a 299 second elapsed time, without any wait events in that 299 seconds, and with only 22,298 consistent gets. Very strange, almost as if one of the following happened:
    * The session was spinning on a latch that could not be obtained. The fact that the session did not post a latch related wait event would mean that either the SPINCOUNT parameter was modified to an extremely high value, or that the session encountered a bug in the code that resulted in a nearly infinite loop.
    * Auditing is enabled, and something went wrong in the audit code.
    * The session's process was suspended at the operating system level, although the CPU time probably would not closely equal the elapsed time if that happened.
    * The session's process was in the CPU run queue for 299 seconds - some operating systems will continue to accumulate CPU time for processes even when those processes are sitting in the run queue. It seems very unlikely that this is what happened, unless someone had used something such as the nice OS utility and there was a severe shortage of CPU availability. The tim= time stamps closely agree with the reported wait event elapsed times, so that fact might decrease the chances that the process spent a lot of time in the run queue due to excessive competition for the CPU.
    * The server's clock was resynchronized with an external time source. I could see this affecting the reported elapsed time, but not the CPU time.
    * Memory is over-allocated on the server, which resulted in at least a portion of the buffer cache being swapped to the swap file. It seems unreasonable that it would take nearly 5 minutes to perform a memory swap operation.

    From the trace file, it appears that you are running an Oracle Database version between 10.1 and 10.2.0.5. Exactly which version are you running, and what is your operating system version? Was the session unresponsive for 5 minutes? Were any other sessions unresponsive for roughly the same amount of time? Is this Oracle Database installed on a virtual machine? Are you able to list all of the modified Oracle parameters?

    Charles Hooper
    http://hoopercharles.wordpress.com/
    IT Manager/Oracle DBA
    K&M Machine-Fabricating, Inc.
  • 7. Re: 10046 trace confused
    Jonathan Lewis Oracle ACE Director
    Currently Being Moderated
    Charles Hooper wrote:

    The trace file claims that the session's process was on the CPU for 306 seconds in a 299 second elapsed time, without any wait events in that 299 seconds, and with only 22,298 consistent gets. Very strange, almost as if one of the following happened:
    If you multiply the elapsed time by 1.024 the answer is almost exactly the same as the CPU time - don't ask why Oracle should ever think there are 1024 milliseconds to a second - but it is an odd phenomenon scattered around Oracle's history.

    If we assume it's a strange arithmetic error, then a query with an IN subquery that might have to do a tablescan of small number of blocks for every single row accessed might clock up a lot of CPU accessing blocks that are all buffered - and you have to note that there are a lot of type conversion calls going on as well.

    Regards
    Jonathan Lewis
  • 8. Re: 10046 trace confused
    Paulie Pro
    Currently Being Moderated
    >

    If you multiply the elapsed time by 1.024 the answer is almost exactly the same
    as the CPU time - don't ask why Oracle should ever think there are 1024 milliseconds
    to a second - but it is an odd phenomenon scattered around Oracle's history.
    They're in good company - from here http://www-cs-faculty.stanford.edu/~knuth/faq.html,
    Donald Knuth pays 1 hexadecimal dollar 0x$1.00 ($2.56) to those who discover flaws in his books.


    Paul...

    Jonathan Lewis
  • 9. Re: 10046 trace confused
    CharlesHooper Expert
    Currently Being Moderated
    Jonathan Lewis wrote:
    If you multiply the elapsed time by 1.024 the answer is almost exactly the same as the CPU time - don't ask why Oracle should ever think there are 1024 milliseconds to a second - but it is an odd phenomenon scattered around Oracle's history.
    I did not spend very much time trying to determine why the elapsed time was different than the reported CPU time - one thought was that the difference was just an odd accounting of the CPU time at the operating system level, where the process that happened to be on the CPU at the moment that the OS updated the CPU usage statistics was credited with the full time slice of CPU usage (there are better words to describe this CPU utilization accumulation oddity, but the words escape me at the moment). Now that you pointed out the 1000 vs 1024 time scaling difference, I think that you are absolutely correct (I think that Cary Millsap mentioned this behavior in his book).
    If we assume it's a strange arithmetic error, then a query with an IN subquery that might have to do a tablescan of small number of blocks for every single row accessed might clock up a lot of CPU accessing blocks that are all buffered - and you have to note that there are a lot of type conversion calls going on as well.
    I struggled with this a bit, considering that there were only 22,298 reported consistent gets while burning roughly 5 minutes of CPU utilization. I read somewhere that there is an optimization for IN list subqueries that allows the optimizer to cache the results of the subquery so that it does not need to be re-evaluated for each row. I cannot seem to find a note about this concept in the documentation for Oracle Database, but there is a note in the book "Oracle® Database Lite Troubleshooting and Tuning Guide 10g":
    http://docs.oracle.com/cd/B28928_01/doc/server.103/b28927/performance.htm#sthref139
    "If the optimizer determines that the number of rows returned by a subquery is small and the query is non-correlated, then the query result is cached in memory for better performance"

    Of course, the IN list subquery could have been transformed by the optimizer into a regular join, so how to account for just 22,298 consistent gets? Do you think that it is possible that the runtime engine would pin the majority of the accessed blocks, which would increment the buffer is pinned count statistic every time the buffer was re-accessed, rather than incrementing the consistent gets statistic?:
    http://books.google.com/books?id=G9AJA91PL54C&pg=PA114

    Could the problem be caused by a delayed block cleanout with a slow log writer (I think that I saw a bug report on MOS related to redo logs over NFS, but I cannot find the Doc ID at the moment), but that explanation seems unlikely.

    Other possibilies from MOS, depending on operating system and Oracle Database version:
    Doc ID 433631.1: "Mutex Latch Spin Causes High Cpu on Non-CAS Platforms (HP-UX PA-RISC)"
    Bug 5399325: "MUTEX LATCH SPIN CAUSES HIGH CPU ON NON-CAS PLATFORMS"
    Bug 4926618: "Excessive CPU on HASH UNIQUE when repeated"
    Doc ID 3216002.8, Bug 3216002: "HASH JOIN ANTI / SEMI can use a lot of CPU / spin"
    Doc ID 752108.1: "ORA-04031 errors lead to high CPU usage and DB Hang"

    Charles Hooper
    http://hoopercharles.wordpress.com/
    IT Manager/Oracle DBA
    K&M Machine-Fabricating, Inc.
  • 10. Re: 10046 trace confused
    Jonathan Lewis Oracle ACE Director
    Currently Being Moderated
    Charles Hooper wrote:
    Jonathan Lewis wrote:
    If you multiply the elapsed time by 1.024 the answer is almost exactly the same as the CPU time - don't ask why Oracle should ever think there are 1024 milliseconds to a second - but it is an odd phenomenon scattered around Oracle's history.
    I did not spend very much time trying to determine why the elapsed time was different than the reported CPU time - one thought was that the difference was just an odd accounting of the CPU time at the operating system level, where the process that happened to be on the CPU at the moment that the OS updated the CPU usage statistics was credited with the full time slice of CPU usage (there are better words to describe this CPU utilization accumulation oddity, but the words escape me at the moment). Now that you pointed out the 1000 vs 1024 time scaling difference, I think that you are absolutely correct (I think that Cary Millsap mentioned this behavior in his book).
    I usually tend to assume that any excess of CPU over Elapsed is likely to be related to the difference in accounting - but when there are no waits and it still appears then I think is has to be the arithmetic. I'd guess that this O/S is recording elapsed time in nanoseconds and Oracle is right-shifting to get to micro-seconds instead of dividing properly. It would be possible to cross check if there were a couple of the ten-second time stamps in the trace file with "tim=" entries sufficiencly close by that we could compare the change in "tim=" with the change in the timestamp.
    If we assume it's a strange arithmetic error, then a query with an IN subquery that might have to do a tablescan of small number of blocks for every single row accessed might clock up a lot of CPU accessing blocks that are all buffered - and you have to note that there are a lot of type conversion calls going on as well.
    I struggled with this a bit, considering that there were only 22,298 reported consistent gets while burning roughly 5 minutes of CPU utilization.
    I was simply trying to think of a method that allowed you to visit a small number of blocks but do a large amount of work on each visit - viz: look at every single row in the block and do something CPU-intensive with it.
    I read somewhere that there is an optimization for IN list subqueries that allows the optimizer to cache the results of the subquery so that it does not need to be re-evaluated for each row. I cannot seem to find a note about this concept in the documentation for Oracle Database, but there is a note in the book "Oracle® Database Lite Troubleshooting and Tuning Guide 10g":
    http://docs.oracle.com/cd/B28928_01/doc/server.103/b28927/performance.htm#sthref139
    "If the optimizer determines that the number of rows returned by a subquery is small and the query is non-correlated, then the query result is cached in memory for better performance"
    That comment in the manuals looks like a bad description of Scalar Subquery Caching - http://jonathanlewis.wordpress.com/2006/11/06/filter-subqueries/ (the example shown in the documentation would either unnest the subquery or take advantage of scalar subquery caching). If that's happning it is more likely to reduce CPU than cause a large amount of it to be consumed. On the other hand (and we do need to know the execution plan) it's possible that an 11g "result cache" mechanism is doing something very inefficient with a large cached result).
    Of course, the IN list subquery could have been transformed by the optimizer into a regular join, so how to account for just 22,298 consistent gets? Do you think that it is possible that the runtime engine would pin the majority of the accessed blocks, which would increment the buffer is pinned count statistic every time the buffer was re-accessed, rather than incrementing the consistent gets statistic?:
    http://books.google.com/books?id=G9AJA91PL54C&pg=PA114
    Since we don't know what the plan is (and the v_ name for an object might indicate that it's a view, which could be hiding a multitude of sins) I think this sounds more likely than my "working hard in a few blocks" bypothesis. Given an unlucky nested loop join you could do a huge amount of work revisitng the same small numbiier of pinned blocks a very large number of times.

    Adding to your list of "maybe's" (and trying to avoid suggestions of bugs) - if there's a btree/bitmap conversion involved then it's possible to use a lot of CPU while visitng a small number of blocks.

    Regards
    Jonathan Lewis

Legend

  • Correct Answers - 10 points
  • Helpful Answers - 5 points