This discussion is archived
5 Replies Latest reply: Feb 2, 2010 10:28 AM by CharlesHooper RSS

Anomaly in TKPROF; Encountered during Performance Tuning

711242 Newbie
Currently Being Moderated
A slow response JOB was traced using ORADEBUG 10046 trace event. The job ran for about 60K Seconds and tracing was offed after 45K second. Strangely I see that the total time spent on Non-Recursive SQL statements is mot more than 5000 Seconds wheras the tkprof o/p has 45000 seconds as in the last line of tkprof O/p.

I am surprised as to where the remaining time of 45080-5000 ~ *40000* Seconds go unaccounted. Please help me out figure this.
Is there any anomaly in TKPROF ?

Pelase refer the last few lines of the TKPROF O/P
********************************************************************************
OVERALL TOTALS FOR ALL NON-RECURSIVE STATEMENTS

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse       34      0.07       0.05          0          0          0           0
Execute 949806    924.33    1010.25          0          0          0          34
Fetch   964106    264.38    2540.21     301146   18226522          0     5094481
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total   1913946   1188.78    3550.51     301146   18226522          0     5094515

Misses in library cache during parse: 5
Misses in library cache during execute: 5

Elapsed times include waiting on following events:
  Event waited on                             Times   Max. Wait  Total Waited
  ----------------------------------------   Waited  ----------  ------------
  db file sequential read                    301151        0.49       1598.40
  SQL*Net message to client                  964141        0.01          1.54
  SQL*Net message from client                964140      973.95       1720.04
  read by other session                      201948        0.98        705.53
  latch: cache buffers chains                 11496        0.01          1.40
  latch free                                     38        0.00          0.02
  cursor: pin S                                  27        0.01          0.19
  SQL*Net more data from client                  87        0.00          0.00
  wait list latch free                            1        0.01          0.01
  latch: object queue header operation            8        0.00          0.00


OVERALL TOTALS FOR ALL RECURSIVE STATEMENTS

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        7      0.00       0.00          0          0          0           0
Execute      7      0.00       0.00          0         21          0           0
Fetch        0      0.00       0.00          0          0          0           0
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total       14      0.00       0.00          0         21          0           0

Misses in library cache during parse: 0

   45  user  SQL statements in session.
    7  internal SQL statements in session.
   52  SQL statements in session.
********************************************************************************
Trace file: s1p_ora_9761.trc
Trace file compatibility: 10.01.00
Sort options: exeela  fchela  prsela
       1  session in tracefile.
      45  user  SQL statements in trace file.
       7  internal SQL statements in trace file.
      52  SQL statements in trace file.
      46  unique SQL statements in trace file.
 71815238  lines in trace file.
   45080  elapsed seconds in trace file.
Regards,
Valli

Edited by: user4566776 on Feb 1, 2010 6:57 AM
  • 1. Re: Anomaly in TKPROF; Encountered during Performance Tuning
    CharlesHooper Expert
    Currently Being Moderated
    Please post the last 30 lines of the raw 10046 trace file.

    Making a wild guess here - it could be that a very long running query that performed 100% logical IO (100% on the CPU or stuck in the CPU queue) was executing for the majority of that time, and had not yet fetched its first row. If that is the case, the last 30 or so lines should include just the periodic output of the current server time.

    Charles Hooper
    Co-author of "Expert Oracle Practices: Oracle Database Administration from the Oak Table"
    http://hoopercharles.wordpress.com/
    IT Manager/Oracle DBA
    K&M Machine-Fabricating, Inc.
  • 2. Re: Anomaly in TKPROF; Encountered during Performance Tuning
    711242 Newbie
    Currently Being Moderated
    Here it is, More than 30 lines from the bottom. Please let me know if you figured any reason for the anomaly ?
    WAIT #69: nam='db file sequential read' ela= 10255 file#=253 block#=514924 blocks=1 obj#=211550 tim=52370728986939
    WAIT #69: nam='db file sequential read' ela= 23269 file#=260 block#=1202218 blocks=1 obj#=211554 tim=52370729010404
    WAIT #69: nam='db file sequential read' ela= 14643 file#=260 block#=1202511 blocks=1 obj#=211554 tim=52370729025179
    WAIT #69: nam='db file sequential read' ela= 4114 file#=265 block#=1033501 blocks=1 obj#=211554 tim=52370729029406
    WAIT #69: nam='db file sequential read' ela= 13199 file#=253 block#=516876 blocks=1 obj#=211550 tim=52370729042681
    WAIT #69: nam='db file sequential read' ela= 15608 file#=253 block#=517214 blocks=1 obj#=211550 tim=52370729058435
    WAIT #69: nam='db file sequential read' ela= 11327 file#=253 block#=517552 blocks=1 obj#=211550 tim=52370729069910
    WAIT #69: nam='db file sequential read' ela= 3025 file#=265 block#=1033566 blocks=1 obj#=211554 tim=52370729073032
    WAIT #69: nam='db file sequential read' ela= 9539 file#=255 block#=981553 blocks=1 obj#=211550 tim=52370729082732
    WAIT #69: nam='db file sequential read' ela= 13474 file#=257 block#=1259903 blocks=1 obj#=211550 tim=52370729096346
    WAIT #69: nam='db file sequential read' ela= 15618 file#=265 block#=1033723 blocks=1 obj#=211554 tim=52370729112106
    WAIT #69: nam='db file sequential read' ela= 7040 file#=265 block#=1033868 blocks=1 obj#=211554 tim=52370729119454
    WAIT #69: nam='db file sequential read' ela= 7995 file#=257 block#=1261555 blocks=1 obj#=211550 tim=52370729127527
    WAIT #69: nam='db file sequential read' ela= 9090 file#=257 block#=1261616 blocks=1 obj#=211550 tim=52370729136775
    WAIT #69: nam='db file sequential read' ela= 6459 file#=257 block#=1260091 blocks=1 obj#=211550 tim=52370729143360
    WAIT #69: nam='db file sequential read' ela= 14971 file#=265 block#=1033992 blocks=1 obj#=211554 tim=52370729158426
    WAIT #69: nam='db file sequential read' ela= 8528 file#=257 block#=1260350 blocks=1 obj#=211550 tim=52370729167077
    WAIT #69: nam='db file sequential read' ela= 17749 file#=257 block#=1262569 blocks=1 obj#=211550 tim=52370729185407
    WAIT #69: nam='db file sequential read' ela= 14060 file#=257 block#=1262585 blocks=1 obj#=211550 tim=52370729199789
    WAIT #69: nam='db file sequential read' ela= 5944 file#=265 block#=1034201 blocks=1 obj#=211554 tim=52370729205855
    WAIT #69: nam='db file sequential read' ela= 7822 file#=265 block#=1034081 blocks=1 obj#=211554 tim=52370729213801
    WAIT #69: nam='db file sequential read' ela= 9290 file#=257 block#=1260904 blocks=1 obj#=211550 tim=52370729223162
    WAIT #69: nam='db file sequential read' ela= 8096 file#=257 block#=1263630 blocks=1 obj#=211550 tim=52370729231435
    WAIT #69: nam='db file sequential read' ela= 17243 file#=266 block#=1164414 blocks=1 obj#=211554 tim=52370729248781
    WAIT #69: nam='db file sequential read' ela= 13505 file#=257 block#=1265138 blocks=1 obj#=211550 tim=52370729262412
    WAIT #69: nam='db file sequential read' ela= 8196 file#=257 block#=1263679 blocks=1 obj#=211550 tim=52370729270695
    WAIT #69: nam='db file sequential read' ela= 10854 file#=266 block#=1164488 blocks=1 obj#=211554 tim=52370729281699
    WAIT #69: nam='db file sequential read' ela= 6916 file#=257 block#=1266913 blocks=1 obj#=211550 tim=52370729288757
    WAIT #69: nam='db file sequential read' ela= 7534 file#=266 block#=1164629 blocks=1 obj#=211554 tim=52370729296373
    WAIT #69: nam='db file sequential read' ela= 11014 file#=262 block#=1215470 blocks=1 obj#=211550 tim=52370729307690
    WAIT #69: nam='db file sequential read' ela= 5464 file#=266 block#=1164797 blocks=1 obj#=211554 tim=52370729313292
    WAIT #69: nam='db file sequential read' ela= 7087 file#=266 block#=1164693 blocks=1 obj#=211554 tim=52370729320517
    WAIT #69: nam='db file sequential read' ela= 9120 file#=266 block#=1164681 blocks=1 obj#=211554 tim=52370729329728
    WAIT #69: nam='db file sequential read' ela= 9765 file#=263 block#=897664 blocks=1 obj#=211550 tim=52370729339566
    WAIT #69: nam='db file sequential read' ela= 9884 file#=266 block#=1164905 blocks=1 obj#=211554 tim=52370729349619
    WAIT #69: nam='db file sequential read' ela= 6704 file#=263 block#=896829 blocks=1 obj#=211550 tim=52370729356392
    WAIT #69: nam='db file sequential read' ela= 11685 file#=263 block#=898947 blocks=1 obj#=211550 tim=52370729368192
    WAIT #69: nam='db file sequential read' ela= 4374 file#=257 block#=1265707 blocks=1 obj#=211550 tim=52370729372745
    WAIT #69: nam='db file sequential read' ela= 10176 file#=266 block#=1164943 blocks=1 obj#=211554 tim=52370729383118
    WAIT #69: nam='db file sequential read' ela= 5685 file#=263 block#=898167 blocks=1 obj#=211550 tim=52370729388872
    WAIT #69: nam='db file sequential read' ela= 12407 file#=263 block#=898515 blocks=1 obj#=211550 tim=52370729401365
    WAIT #69: nam='db file sequential read' ela= 8000 file#=266 block#=1165054 blocks=1 obj#=211554 tim=52370729409495
    WAIT #69: nam='db file sequential read' ela= 10842 file#=263 block#=898387 blocks=1 obj#=211550 tim=52370729420558
    WAIT #69: nam='db file sequential read' ela= 9459 file#=266 block#=1164884 blocks=1 obj#=211554 tim=52370729430097
    WAIT #69: nam='db file sequential read' ela= 9176 file#=263 block#=899680 blocks=1 obj#=211550 tim=52370729439344
    WAIT #69: nam='db file sequential read' ela= 9433 file#=266 block#=1165227 blocks=1 obj#=211554 tim=52370729448885
    WAIT #69: nam='db file sequential read' ela= 9860 file#=264 block#=1166755 blocks=1 obj#=211550 tim=52370729458836
    WAIT #69: nam='db file sequential read' ela= 15433 file#=263 block#=899734 blocks=1 obj#=211550 tim=52370729474390
    WAIT #69: nam='db file sequential read' ela= 2525 file#=264 block#=1168289 blocks=1 obj#=211550 tim=52370729477056
    WAIT #69: nam='db file sequential read' ela= 14267 file#=264 block#=1166072 blocks=1 obj#=211550 tim=52370729491427
    WAIT #69: nam='db file sequential read' ela= 4862 file#=266 block#=1165189 blocks=1 obj#=211554 tim=52370729496443
    WAIT #69: nam='db file sequential read' ela= 6263 file#=264 block#=1165451 blocks=1 obj#=211550 tim=52370729502825
    WAIT #69: nam='db file sequential read' ela= 7960 file#=266 block#=1165114 blocks=1 obj#=211554 tim=52370729510959
    WAIT #69: nam='db file sequential read' ela= 11191 file#=263 block#=899590 blocks=1 obj#=211550 tim=52370729522220
    WAIT #69: nam='db file sequential read' ela= 16590 file#=264 block#=1166723 blocks=1 obj#=211550 tim=52370729539121
    WAIT #69: nam='db file sequential read' ela= 17177 file#=264 block#=1169520 blocks=1 obj#=211550 tim=52370729556924
    WAIT #69: nam='db file sequential read' ela= 10969 file#=265 block#=1042853 blocks=1 obj#=211550 tim=52370729568068
    WAIT #69: nam='db file sequential read' ela= 9767 file#=264 block#=1169529 blocks=1 obj#=211550 tim=52370729578053
    WAIT #69: nam='db file sequential read' ela= 8938 file#=267 block#=874778 blocks=1 obj#=211554 tim=52370729587146
    WAIT #69: nam='db file sequential read' ela= 14940 file#=7 block#=192178 blocks=1 obj#=211550 tim=52370729602405
    WAIT #69: nam='db file sequential read' ela= 11551 file#=7 block#=194476 blocks=1 obj#=211550 tim=52370729614295
    FETCH #69:c=50000,e=2063972,p=176,cr=1956,cu=0,mis=0,r=489,dep=0,og=1,tim=52370729615480
    *** 2010-01-29 03:51:49.192
    Received ORADEBUG command 'tracefile_name' from process Unix process pid: 9597, image:
    *** 2010-01-29 03:53:38.062
    Received ORADEBUG command 'event 10046 trace name context off' from process Unix process pid: 9597, image:
    Dump file /oracle/S1P/saptrace/usertrace/s1p_ora_9761.trc
    Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bit Production
    With the Partitioning, Data Mining and Real Application Testing options
    ORACLE_HOME = /oracle/S1P/102_64
    System name:    HP-UX
    Node name:      nlscuxp0
    Release:        B.11.23
    Version:        U
    Machine:        ia64
    Instance name: S1P
    Redo thread mounted by this instance: 1
    Oracle process number: 0
    Unix process pid: 9761, image: oracle@nlscuxp0
  • 3. Re: Anomaly in TKPROF; Encountered during Performance Tuning
    CharlesHooper Expert
    Currently Being Moderated
    Thanks for posting the requested information.

    If we just analyze the timestamps and the wait events in what you posted (all converted to seconds, assuming division by 1,000,000 yields seconds):
    Timestamp    T Delta  Wait Time  T Delta - Wait
    52370728.99            0.010255  
    52370729.01  0.023465  0.023269  0.000196
    52370729.03  0.014775  0.014643  0.000132
    52370729.03  0.004227  0.004114  0.000113
    52370729.04  0.013275  0.013199  0.000076
    52370729.06  0.015754  0.015608  0.000146
    52370729.07  0.011475  0.011327  0.000148
    52370729.07  0.003122  0.003025  0.000097
    52370729.08  0.009700  0.009539  0.000161
    52370729.10  0.013614  0.013474  0.000140
    52370729.11  0.015760  0.015618  0.000142
    52370729.12  0.007348  0.007040  0.000308
    52370729.13  0.008073  0.007995  0.000078
    52370729.14  0.009248  0.009090  0.000158
    52370729.14  0.006585  0.006459  0.000126
    52370729.16  0.015066  0.014971  0.000095
    52370729.17  0.008651  0.008528  0.000123
    52370729.19  0.018330  0.017749  0.000581
    52370729.20  0.014382  0.014060  0.000322
    52370729.21  0.006066  0.005944  0.000122
    52370729.21  0.007946  0.007822  0.000124
    52370729.22  0.009361  0.009290  0.000071
    52370729.23  0.008273  0.008096  0.000177
    52370729.25  0.017346  0.017243  0.000103
    52370729.26  0.013631  0.013505  0.000126
    52370729.27  0.008283  0.008196  0.000087
    52370729.28  0.011004  0.010854  0.000150
    52370729.29  0.007058  0.006916  0.000142
    52370729.30  0.007616  0.007534  0.000082
    52370729.31  0.011317  0.011014  0.000303
    52370729.31  0.005602  0.005464  0.000138
    52370729.32  0.007225  0.007087  0.000138
    52370729.33  0.009211  0.009120  0.000091
    52370729.34  0.009838  0.009765  0.000073
    52370729.35  0.010053  0.009884  0.000169
    52370729.36  0.006773  0.006704  0.000069
    52370729.37  0.011800  0.011685  0.000115
    52370729.37  0.004553  0.004374  0.000179
    52370729.38  0.010373  0.010176  0.000197
    52370729.39  0.005754  0.005685  0.000069
    52370729.40  0.012493  0.012407  0.000086
    52370729.41  0.008130  0.008000  0.000130
    52370729.42  0.011063  0.010842  0.000221
    52370729.43  0.009539  0.009459  0.000080
    52370729.44  0.009247  0.009176  0.000071
    52370729.45  0.009541  0.009433  0.000108
    52370729.46  0.009951  0.009860  0.000091
    52370729.47  0.015554  0.015433  0.000121
    52370729.48  0.002666  0.002525  0.000141
    52370729.49  0.014371  0.014267  0.000104
    52370729.50  0.005016  0.004862  0.000154
    52370729.50  0.006382  0.006263  0.000119
    52370729.51  0.008134  0.007960  0.000174
    52370729.52  0.011261  0.011191  0.000070
    52370729.54  0.016901  0.016590  0.000311
    52370729.56  0.017803  0.017177  0.000626
    52370729.57  0.011144  0.010969  0.000175
    52370729.58  0.009985  0.009767  0.000218
    52370729.59  0.009093  0.008938  0.000155
    52370729.60  0.015259  0.014940  0.000319
    52370729.61  0.011890  0.011551  0.000339
    In most cases, the difference between the time stamps and the wait event time is less than 0.00017 seconds, but this occasionally jumps to 0.000626, 0.000581, 0.000339 seconds. Someone might be able to say for certain that this means that occasionally there is a lot of competition for CPU and that the process sometimes sits in the run queue 3 to 5 times longer than it typically does (someone else might see an entirely different pattern).

    Unfortunately, no timestamps are written after the final fetch call that retrieved 489 rows and nothing was written to the trace file in the nearly 2 seconds between the time ORADEBUG was used to retrieve the tracefile name and ORADEBUG was used to disable tracing:
    FETCH #69:c=50000,e=2063972,p=176,cr=1956,cu=0,mis=0,r=489,dep=0,og=1,tim=52370729615480
    *** 2010-01-29 03:51:49.192
    Received ORADEBUG command 'tracefile_name' from process Unix process pid: 9597, image:
    *** 2010-01-29 03:53:38.062
    Received ORADEBUG command 'event 10046 trace name context off' from process Unix process pid: 9597, image:
    Notice that the final fetch in the trace file required 2.063972 seconds, with 0.050000 seconds of CPU time. If the SQL statement was still actively processing at that point, Oracle probably should have written a "WAIT #69: nam='SQL*Net message from client'", "WAIT #69: nam='SQL*Net message to client'", another fetch call (assuming no I/O was required), or another single block/multiblock read in that nearly 2 second time period.

    Sitting in the CPU run queue will cause "lost" time where the time is not recorded in a wait event or on the CPU - but the process would have had to sit in the run queue for a very long portion of the total execution time for that to be an explanation. I wonder if you would see different results if you did not enable and disable the trace using ORADEBUG? You can find other methods for enabling a 10046 trace here:
    http://hoopercharles.wordpress.com/2009/12/01/10046-extended-sql-trace-interpretation/

    Maybe someone else will see something that I missed?

    Charles Hooper
    Co-author of "Expert Oracle Practices: Oracle Database Administration from the Oak Table"
    http://hoopercharles.wordpress.com/
    IT Manager/Oracle DBA
    K&M Machine-Fabricating, Inc.
  • 4. Re: Anomaly in TKPROF; Encountered during Performance Tuning
    711242 Newbie
    Currently Being Moderated
    Thanks a lot charles, for putting across your analysis in a simple way.

    So I understand that It is the lost time due to High contention for CPU. I have a Q? then.. If my tracing encountered such a "lost cpu" wouldn't it be another wait event like CPU WAIT or.. something like that.

    I must say, you have posted very good articles on the above said link but Does my tracing content depend on how I did the trace ?

    Additonaly, I did another trace of the same JOB and this time I did not switch off the tracing but I terminated the JOB still I see same anomaly.

    One BIG Question ==>
    From my above TKPROF o/p can I infer that the DB time is 4000 and odd seconds ? and delay beyond this has nothing to do with my Oracle DB ?
    AND
    DO you have any tips as to Tune this case?
  • 5. Re: Anomaly in TKPROF; Encountered during Performance Tuning
    CharlesHooper Expert
    Currently Being Moderated
    I believe that you should be able to execute a command like this on Unix/Linux to check the CPU run queue once a minute for 10 minutes:
    sar –q 60 10
    The above might produce output that looks something like this (output of sar –q 60 5):
    03:58:27 PM runq-sz plist-sz ldavg-1 ldavg-5 ldavg-15
    03:59:27 PM      42      332   39.21   20.83     8.81
    04:00:27 PM      48      348   47.13   26.40    11.47
    04:01:27 PM      57      361   52.45   31.67    14.21
    04:02:27 PM      45      416   59.42   37.36    17.25
    04:03:27 PM      26      338   57.97   41.22    19.85
    Average:         44      359   51.24   31.50    14.32
    In the above, the runq-sz column indicates the CPU run queue length - the number of processes/threads that are waiting for time to run on the CPUs. A run queue of 42 is very bad if the server has a single CPU, but probably reasonable if the server has 42 CPUs.

    There is no "CPU WAIT" that is output to the trace file (as far as I know - although it is possible that "resmgr: cpu quantum" might appear, see http://download.oracle.com/docs/cd/B19306_01/server.102/b14237/waitevents003.htm). You can possibly monitor the delta values in V$OSSTAT, V$SYS_TIME_MODEL, or V$SESS_TIME_MODEL (for the specific session) to see what is happening - V$SESS_TIME_MODEL will list the "DB time" statistic for each session.

    If you have access to a client computer running Windows, you could try something like this to watch the session activity:
    http://hoopercharles.wordpress.com/2010/01/13/working-with-oracle-time-model-data/

    I think that I have only once enabled a 10046 trace with ORADEBUG (essentially just, neat - it works), so I can't answer how that affects the trace file output.

    If you spend some time manually analyzing the trace file looking for gaps in the tim= values that cannot be associated with wait events or CPU time, you might be able to figure out what is happening to the lost time.

    Charles Hooper
    Co-author of "Expert Oracle Practices: Oracle Database Administration from the Oak Table"
    http://hoopercharles.wordpress.com/
    IT Manager/Oracle DBA
    K&M Machine-Fabricating, Inc.

Legend

  • Correct Answers - 10 points
  • Helpful Answers - 5 points