5 Replies Latest reply: Feb 2, 2010 12:28 PM by Charles Hooper

    Anomaly in TKPROF; Encountered during Performance Tuning

    user4566776
      A slow-running job was traced using the ORADEBUG 10046 trace event. The job ran for about 60,000 seconds and tracing was turned off after about 45,000 seconds. Strangely, I see that the total time spent on non-recursive SQL statements is not more than 5,000 seconds, whereas the last line of the TKPROF output reports about 45,000 elapsed seconds.

      I am puzzled about where the remaining 45,080 - 5,000 ~ 40,000 seconds went unaccounted (my rough arithmetic is sketched after the TKPROF output below). Please help me figure this out.
      Is there an anomaly in TKPROF?

      Please refer to the last few lines of the TKPROF output:
      ********************************************************************************
      OVERALL TOTALS FOR ALL NON-RECURSIVE STATEMENTS
      
      call     count       cpu    elapsed       disk      query    current        rows
      ------- ------  -------- ---------- ---------- ---------- ----------  ----------
      Parse       34      0.07       0.05          0          0          0           0
      Execute 949806    924.33    1010.25          0          0          0          34
      Fetch   964106    264.38    2540.21     301146   18226522          0     5094481
      ------- ------  -------- ---------- ---------- ---------- ----------  ----------
      total   1913946   1188.78    3550.51     301146   18226522          0     5094515
      
      Misses in library cache during parse: 5
      Misses in library cache during execute: 5
      
      Elapsed times include waiting on following events:
        Event waited on                             Times   Max. Wait  Total Waited
        ----------------------------------------   Waited  ----------  ------------
        db file sequential read                    301151        0.49       1598.40
        SQL*Net message to client                  964141        0.01          1.54
        SQL*Net message from client                964140      973.95       1720.04
        read by other session                      201948        0.98        705.53
        latch: cache buffers chains                 11496        0.01          1.40
        latch free                                     38        0.00          0.02
        cursor: pin S                                  27        0.01          0.19
        SQL*Net more data from client                  87        0.00          0.00
        wait list latch free                            1        0.01          0.01
        latch: object queue header operation            8        0.00          0.00
      
      
      OVERALL TOTALS FOR ALL RECURSIVE STATEMENTS
      
      call     count       cpu    elapsed       disk      query    current        rows
      ------- ------  -------- ---------- ---------- ---------- ----------  ----------
      Parse        7      0.00       0.00          0          0          0           0
      Execute      7      0.00       0.00          0         21          0           0
      Fetch        0      0.00       0.00          0          0          0           0
      ------- ------  -------- ---------- ---------- ---------- ----------  ----------
      total       14      0.00       0.00          0         21          0           0
      
      Misses in library cache during parse: 0
      
         45  user  SQL statements in session.
          7  internal SQL statements in session.
         52  SQL statements in session.
      ********************************************************************************
      Trace file: s1p_ora_9761.trc
      Trace file compatibility: 10.01.00
      Sort options: exeela  fchela  prsela
             1  session in tracefile.
            45  user  SQL statements in trace file.
             7  internal SQL statements in trace file.
            52  SQL statements in trace file.
            46  unique SQL statements in trace file.
       71815238  lines in trace file.
         45080  elapsed seconds in trace file.
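
      For reference, here is the rough arithmetic behind my ~5,000 / ~40,000 second figures (just a sketch in Python using the totals above; I am not certain exactly which waits TKPROF counts inside the elapsed column):

      # Rough accounting check, using the TKPROF totals above (all values in seconds).
      trace_elapsed      = 45080.0   # "elapsed seconds in trace file"
      call_elapsed       = 3550.51   # non-recursive Parse + Execute + Fetch elapsed
      idle_between_calls = 1720.04   # 'SQL*Net message from client' (largely time between calls)

      accounted = call_elapsed + idle_between_calls
      unaccounted = trace_elapsed - accounted
      print(f"accounted ~ {accounted:.0f} s, unaccounted ~ {unaccounted:.0f} s")
      # accounted ~ 5271 s, unaccounted ~ 39809 s
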
      Regards,
      Valli

        • 1. Re: Anomaly in TKPROF; Encountered during Performance Tuning
          Charles Hooper
          Please post the last 30 lines of the raw 10046 trace file.

          Making a wild guess here: it could be that a very long-running query performing 100% logical I/O (100% on the CPU or stuck in the CPU run queue) was executing for the majority of that time and had not yet fetched its first row. If that is the case, the last 30 or so lines should include just the periodic output of the current server time.
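
          If the trace file is very large, something like this will pull just the tail without opening the whole file in an editor (a minimal sketch in Python; the file name is the one shown in your TKPROF summary, so adjust the path for your system):

          from collections import deque

          # Keep only the last 30 lines of the (potentially huge) raw trace file in memory.
          with open("s1p_ora_9761.trc") as trace:
              last_lines = deque(trace, maxlen=30)

          for line in last_lines:
              print(line, end="")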

          Charles Hooper
          Co-author of "Expert Oracle Practices: Oracle Database Administration from the Oak Table"
          http://hoopercharles.wordpress.com/
          IT Manager/Oracle DBA
          K&M Machine-Fabricating, Inc.
          • 2. Re: Anomaly in TKPROF; Encountered during Performance Tuning
            user4566776
            Here it is, more than 30 lines from the bottom. Please let me know if you can see any reason for the anomaly.
            WAIT #69: nam='db file sequential read' ela= 10255 file#=253 block#=514924 blocks=1 obj#=211550 tim=52370728986939
            WAIT #69: nam='db file sequential read' ela= 23269 file#=260 block#=1202218 blocks=1 obj#=211554 tim=52370729010404
            WAIT #69: nam='db file sequential read' ela= 14643 file#=260 block#=1202511 blocks=1 obj#=211554 tim=52370729025179
            WAIT #69: nam='db file sequential read' ela= 4114 file#=265 block#=1033501 blocks=1 obj#=211554 tim=52370729029406
            WAIT #69: nam='db file sequential read' ela= 13199 file#=253 block#=516876 blocks=1 obj#=211550 tim=52370729042681
            WAIT #69: nam='db file sequential read' ela= 15608 file#=253 block#=517214 blocks=1 obj#=211550 tim=52370729058435
            WAIT #69: nam='db file sequential read' ela= 11327 file#=253 block#=517552 blocks=1 obj#=211550 tim=52370729069910
            WAIT #69: nam='db file sequential read' ela= 3025 file#=265 block#=1033566 blocks=1 obj#=211554 tim=52370729073032
            WAIT #69: nam='db file sequential read' ela= 9539 file#=255 block#=981553 blocks=1 obj#=211550 tim=52370729082732
            WAIT #69: nam='db file sequential read' ela= 13474 file#=257 block#=1259903 blocks=1 obj#=211550 tim=52370729096346
            WAIT #69: nam='db file sequential read' ela= 15618 file#=265 block#=1033723 blocks=1 obj#=211554 tim=52370729112106
            WAIT #69: nam='db file sequential read' ela= 7040 file#=265 block#=1033868 blocks=1 obj#=211554 tim=52370729119454
            WAIT #69: nam='db file sequential read' ela= 7995 file#=257 block#=1261555 blocks=1 obj#=211550 tim=52370729127527
            WAIT #69: nam='db file sequential read' ela= 9090 file#=257 block#=1261616 blocks=1 obj#=211550 tim=52370729136775
            WAIT #69: nam='db file sequential read' ela= 6459 file#=257 block#=1260091 blocks=1 obj#=211550 tim=52370729143360
            WAIT #69: nam='db file sequential read' ela= 14971 file#=265 block#=1033992 blocks=1 obj#=211554 tim=52370729158426
            WAIT #69: nam='db file sequential read' ela= 8528 file#=257 block#=1260350 blocks=1 obj#=211550 tim=52370729167077
            WAIT #69: nam='db file sequential read' ela= 17749 file#=257 block#=1262569 blocks=1 obj#=211550 tim=52370729185407
            WAIT #69: nam='db file sequential read' ela= 14060 file#=257 block#=1262585 blocks=1 obj#=211550 tim=52370729199789
            WAIT #69: nam='db file sequential read' ela= 5944 file#=265 block#=1034201 blocks=1 obj#=211554 tim=52370729205855
            WAIT #69: nam='db file sequential read' ela= 7822 file#=265 block#=1034081 blocks=1 obj#=211554 tim=52370729213801
            WAIT #69: nam='db file sequential read' ela= 9290 file#=257 block#=1260904 blocks=1 obj#=211550 tim=52370729223162
            WAIT #69: nam='db file sequential read' ela= 8096 file#=257 block#=1263630 blocks=1 obj#=211550 tim=52370729231435
            WAIT #69: nam='db file sequential read' ela= 17243 file#=266 block#=1164414 blocks=1 obj#=211554 tim=52370729248781
            WAIT #69: nam='db file sequential read' ela= 13505 file#=257 block#=1265138 blocks=1 obj#=211550 tim=52370729262412
            WAIT #69: nam='db file sequential read' ela= 8196 file#=257 block#=1263679 blocks=1 obj#=211550 tim=52370729270695
            WAIT #69: nam='db file sequential read' ela= 10854 file#=266 block#=1164488 blocks=1 obj#=211554 tim=52370729281699
            WAIT #69: nam='db file sequential read' ela= 6916 file#=257 block#=1266913 blocks=1 obj#=211550 tim=52370729288757
            WAIT #69: nam='db file sequential read' ela= 7534 file#=266 block#=1164629 blocks=1 obj#=211554 tim=52370729296373
            WAIT #69: nam='db file sequential read' ela= 11014 file#=262 block#=1215470 blocks=1 obj#=211550 tim=52370729307690
            WAIT #69: nam='db file sequential read' ela= 5464 file#=266 block#=1164797 blocks=1 obj#=211554 tim=52370729313292
            WAIT #69: nam='db file sequential read' ela= 7087 file#=266 block#=1164693 blocks=1 obj#=211554 tim=52370729320517
            WAIT #69: nam='db file sequential read' ela= 9120 file#=266 block#=1164681 blocks=1 obj#=211554 tim=52370729329728
            WAIT #69: nam='db file sequential read' ela= 9765 file#=263 block#=897664 blocks=1 obj#=211550 tim=52370729339566
            WAIT #69: nam='db file sequential read' ela= 9884 file#=266 block#=1164905 blocks=1 obj#=211554 tim=52370729349619
            WAIT #69: nam='db file sequential read' ela= 6704 file#=263 block#=896829 blocks=1 obj#=211550 tim=52370729356392
            WAIT #69: nam='db file sequential read' ela= 11685 file#=263 block#=898947 blocks=1 obj#=211550 tim=52370729368192
            WAIT #69: nam='db file sequential read' ela= 4374 file#=257 block#=1265707 blocks=1 obj#=211550 tim=52370729372745
            WAIT #69: nam='db file sequential read' ela= 10176 file#=266 block#=1164943 blocks=1 obj#=211554 tim=52370729383118
            WAIT #69: nam='db file sequential read' ela= 5685 file#=263 block#=898167 blocks=1 obj#=211550 tim=52370729388872
            WAIT #69: nam='db file sequential read' ela= 12407 file#=263 block#=898515 blocks=1 obj#=211550 tim=52370729401365
            WAIT #69: nam='db file sequential read' ela= 8000 file#=266 block#=1165054 blocks=1 obj#=211554 tim=52370729409495
            WAIT #69: nam='db file sequential read' ela= 10842 file#=263 block#=898387 blocks=1 obj#=211550 tim=52370729420558
            WAIT #69: nam='db file sequential read' ela= 9459 file#=266 block#=1164884 blocks=1 obj#=211554 tim=52370729430097
            WAIT #69: nam='db file sequential read' ela= 9176 file#=263 block#=899680 blocks=1 obj#=211550 tim=52370729439344
            WAIT #69: nam='db file sequential read' ela= 9433 file#=266 block#=1165227 blocks=1 obj#=211554 tim=52370729448885
            WAIT #69: nam='db file sequential read' ela= 9860 file#=264 block#=1166755 blocks=1 obj#=211550 tim=52370729458836
            WAIT #69: nam='db file sequential read' ela= 15433 file#=263 block#=899734 blocks=1 obj#=211550 tim=52370729474390
            WAIT #69: nam='db file sequential read' ela= 2525 file#=264 block#=1168289 blocks=1 obj#=211550 tim=52370729477056
            WAIT #69: nam='db file sequential read' ela= 14267 file#=264 block#=1166072 blocks=1 obj#=211550 tim=52370729491427
            WAIT #69: nam='db file sequential read' ela= 4862 file#=266 block#=1165189 blocks=1 obj#=211554 tim=52370729496443
            WAIT #69: nam='db file sequential read' ela= 6263 file#=264 block#=1165451 blocks=1 obj#=211550 tim=52370729502825
            WAIT #69: nam='db file sequential read' ela= 7960 file#=266 block#=1165114 blocks=1 obj#=211554 tim=52370729510959
            WAIT #69: nam='db file sequential read' ela= 11191 file#=263 block#=899590 blocks=1 obj#=211550 tim=52370729522220
            WAIT #69: nam='db file sequential read' ela= 16590 file#=264 block#=1166723 blocks=1 obj#=211550 tim=52370729539121
            WAIT #69: nam='db file sequential read' ela= 17177 file#=264 block#=1169520 blocks=1 obj#=211550 tim=52370729556924
            WAIT #69: nam='db file sequential read' ela= 10969 file#=265 block#=1042853 blocks=1 obj#=211550 tim=52370729568068
            WAIT #69: nam='db file sequential read' ela= 9767 file#=264 block#=1169529 blocks=1 obj#=211550 tim=52370729578053
            WAIT #69: nam='db file sequential read' ela= 8938 file#=267 block#=874778 blocks=1 obj#=211554 tim=52370729587146
            WAIT #69: nam='db file sequential read' ela= 14940 file#=7 block#=192178 blocks=1 obj#=211550 tim=52370729602405
            WAIT #69: nam='db file sequential read' ela= 11551 file#=7 block#=194476 blocks=1 obj#=211550 tim=52370729614295
            FETCH #69:c=50000,e=2063972,p=176,cr=1956,cu=0,mis=0,r=489,dep=0,og=1,tim=52370729615480
            *** 2010-01-29 03:51:49.192
            Received ORADEBUG command 'tracefile_name' from process Unix process pid: 9597, image:
            *** 2010-01-29 03:53:38.062
            Received ORADEBUG command 'event 10046 trace name context off' from process Unix process pid: 9597, image:
            Dump file /oracle/S1P/saptrace/usertrace/s1p_ora_9761.trc
            Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bit Production
            With the Partitioning, Data Mining and Real Application Testing options
            ORACLE_HOME = /oracle/S1P/102_64
            System name:    HP-UX
            Node name:      nlscuxp0
            Release:        B.11.23
            Version:        U
            Machine:        ia64
            Instance name: S1P
            Redo thread mounted by this instance: 1
            Oracle process number: 0
            Unix process pid: 9761, image: oracle@nlscuxp0
            • 3. Re: Anomaly in TKPROF; Encountered during Performance Tuning
              Charles Hooper
              Thanks for posting the requested information.

              If we just analyze the timestamps and the wait events in what you posted (all converted to seconds, assuming that dividing the ela= and tim= values by 1,000,000 yields seconds):
              Timestamp    T Delta  Wait Time  T Delta - Wait
              52370728.99            0.010255  
              52370729.01  0.023465  0.023269  0.000196
              52370729.03  0.014775  0.014643  0.000132
              52370729.03  0.004227  0.004114  0.000113
              52370729.04  0.013275  0.013199  0.000076
              52370729.06  0.015754  0.015608  0.000146
              52370729.07  0.011475  0.011327  0.000148
              52370729.07  0.003122  0.003025  0.000097
              52370729.08  0.009700  0.009539  0.000161
              52370729.10  0.013614  0.013474  0.000140
              52370729.11  0.015760  0.015618  0.000142
              52370729.12  0.007348  0.007040  0.000308
              52370729.13  0.008073  0.007995  0.000078
              52370729.14  0.009248  0.009090  0.000158
              52370729.14  0.006585  0.006459  0.000126
              52370729.16  0.015066  0.014971  0.000095
              52370729.17  0.008651  0.008528  0.000123
              52370729.19  0.018330  0.017749  0.000581
              52370729.20  0.014382  0.014060  0.000322
              52370729.21  0.006066  0.005944  0.000122
              52370729.21  0.007946  0.007822  0.000124
              52370729.22  0.009361  0.009290  0.000071
              52370729.23  0.008273  0.008096  0.000177
              52370729.25  0.017346  0.017243  0.000103
              52370729.26  0.013631  0.013505  0.000126
              52370729.27  0.008283  0.008196  0.000087
              52370729.28  0.011004  0.010854  0.000150
              52370729.29  0.007058  0.006916  0.000142
              52370729.30  0.007616  0.007534  0.000082
              52370729.31  0.011317  0.011014  0.000303
              52370729.31  0.005602  0.005464  0.000138
              52370729.32  0.007225  0.007087  0.000138
              52370729.33  0.009211  0.009120  0.000091
              52370729.34  0.009838  0.009765  0.000073
              52370729.35  0.010053  0.009884  0.000169
              52370729.36  0.006773  0.006704  0.000069
              52370729.37  0.011800  0.011685  0.000115
              52370729.37  0.004553  0.004374  0.000179
              52370729.38  0.010373  0.010176  0.000197
              52370729.39  0.005754  0.005685  0.000069
              52370729.40  0.012493  0.012407  0.000086
              52370729.41  0.008130  0.008000  0.000130
              52370729.42  0.011063  0.010842  0.000221
              52370729.43  0.009539  0.009459  0.000080
              52370729.44  0.009247  0.009176  0.000071
              52370729.45  0.009541  0.009433  0.000108
              52370729.46  0.009951  0.009860  0.000091
              52370729.47  0.015554  0.015433  0.000121
              52370729.48  0.002666  0.002525  0.000141
              52370729.49  0.014371  0.014267  0.000104
              52370729.50  0.005016  0.004862  0.000154
              52370729.50  0.006382  0.006263  0.000119
              52370729.51  0.008134  0.007960  0.000174
              52370729.52  0.011261  0.011191  0.000070
              52370729.54  0.016901  0.016590  0.000311
              52370729.56  0.017803  0.017177  0.000626
              52370729.57  0.011144  0.010969  0.000175
              52370729.58  0.009985  0.009767  0.000218
              52370729.59  0.009093  0.008938  0.000155
              52370729.60  0.015259  0.014940  0.000319
              52370729.61  0.011890  0.011551  0.000339
              In most cases the difference between the timestamps and the wait event time is less than 0.00017 seconds, but it occasionally jumps to 0.000626, 0.000581, or 0.000339 seconds. One possible interpretation is that there is occasionally a lot of competition for the CPU and that the process sometimes sits in the run queue 3 to 5 times longer than it typically does (someone else might see an entirely different pattern).
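
              For anyone who wants to reproduce the table above, a sketch of the calculation in Python (assuming, as above, that the ela= and tim= values in the trace are microseconds; the regular expression only looks at the 'db file sequential read' lines that were posted):

              import re

              # Pull ela= (wait time) and tim= (timestamp) from each 'db file sequential read' WAIT line.
              wait_re = re.compile(r"WAIT #\d+: nam='db file sequential read' ela=\s*(\d+).*tim=(\d+)")

              prev_tim = None
              with open("s1p_ora_9761.trc") as trace:
                  for line in trace:
                      match = wait_re.search(line)
                      if not match:
                          continue
                      ela_us, tim_us = int(match.group(1)), int(match.group(2))
                      if prev_tim is not None:
                          t_delta = (tim_us - prev_tim) / 1000000.0   # seconds since the previous wait's timestamp
                          wait_s = ela_us / 1000000.0                 # recorded wait time in seconds
                          print(f"{tim_us / 1000000.0:.2f}  {t_delta:.6f}  {wait_s:.6f}  {t_delta - wait_s:.6f}")
                      prev_tim = tim_us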

              Unfortunately, no further database call or wait lines are written after the final fetch that retrieved 489 rows, and nothing was written to the trace file in the nearly two minutes between the time ORADEBUG was used to retrieve the trace file name and the time ORADEBUG was used to disable tracing:
              FETCH #69:c=50000,e=2063972,p=176,cr=1956,cu=0,mis=0,r=489,dep=0,og=1,tim=52370729615480
              *** 2010-01-29 03:51:49.192
              Received ORADEBUG command 'tracefile_name' from process Unix process pid: 9597, image:
              *** 2010-01-29 03:53:38.062
              Received ORADEBUG command 'event 10046 trace name context off' from process Unix process pid: 9597, image:
              Notice that the final fetch in the trace file required 2.063972 seconds, with 0.050000 seconds of CPU time. If the SQL statement was still actively processing at that point, Oracle probably should have written a "WAIT #69: nam='SQL*Net message from client'", a "WAIT #69: nam='SQL*Net message to client'", another fetch call (assuming no I/O was required), or another single block/multiblock read during that interval.

              Sitting in the CPU run queue will cause "lost" time, where the time is recorded neither in a wait event nor as CPU time - but the process would have had to sit in the run queue for a very long portion of the total execution time for that to be the explanation. I wonder whether you would see different results if you enabled and disabled the trace by a method other than ORADEBUG. You can find other methods for enabling a 10046 trace here (a sketch of one alternative follows the link):
              http://hoopercharles.wordpress.com/2009/12/01/10046-extended-sql-trace-interpretation/
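
              For example, one widely used alternative is DBMS_MONITOR.SESSION_TRACE_ENABLE. A sketch using the cx_Oracle driver is below (the connection details and the OS process id are placeholders, and querying V$SESSION/V$PROCESS and calling DBMS_MONITOR require the appropriate privileges):

              import cx_Oracle  # assumes the cx_Oracle driver is available on the client

              connection = cx_Oracle.connect("system", "password", "S1P")  # placeholder credentials
              cursor = connection.cursor()

              # Look up the target session by its OS process id (the spid is just an example).
              cursor.execute("""
                  select s.sid, s.serial#
                    from v$session s, v$process p
                   where s.paddr = p.addr
                     and p.spid = :spid""", spid="9761")
              sid, serial = cursor.fetchone()

              # Enable a 10046-style trace with wait events (and without bind values) - no ORADEBUG needed.
              cursor.execute("""
                  begin
                      dbms_monitor.session_trace_enable(
                          session_id => :sid, serial_num => :serial,
                          waits => TRUE, binds => FALSE);
                  end;""", sid=sid, serial=serial)

              # ... let the job run, then turn tracing off:
              cursor.execute("""
                  begin
                      dbms_monitor.session_trace_disable(session_id => :sid, serial_num => :serial);
                  end;""", sid=sid, serial=serial)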

              Maybe someone else will see something that I missed?

              Charles Hooper
              Co-author of "Expert Oracle Practices: Oracle Database Administration from the Oak Table"
              http://hoopercharles.wordpress.com/
              IT Manager/Oracle DBA
              K&M Machine-Fabricating, Inc.
              • 4. Re: Anomaly in TKPROF; Encountered during Performance Tuning
                user4566776
                Thanks a lot, Charles, for putting across your analysis in a simple way.

                So I understand that it is lost time caused by high contention for the CPU. I have a question, then: if my tracing encountered such "lost CPU" time, wouldn't it appear as another wait event, something like CPU WAIT?

                I must say, you have posted very good articles at the above link. But does the content of my trace depend on how I enabled the trace?

                Additionally, I did another trace of the same job; this time I did not switch off the tracing but terminated the job instead, and I still see the same anomaly.

                One BIG question:
                From my TKPROF output above, can I infer that the DB time is 4,000-odd seconds, and that any delay beyond this has nothing to do with my Oracle database?
                Also, do you have any tips on how to tune this case?
                • 5. Re: Anomaly in TKPROF; Encountered during Performance Tuning
                  Charles Hooper
                  I believe that you should be able to execute a command like this on Unix/Linux to check the CPU run queue once a minute for 10 minutes:
                  sar -q 60 10
                  The above might produce output that looks something like this (output of sar -q 60 5):
                  03:58:27 PM runq-sz plist-sz ldavg-1 ldavg-5 ldavg-15
                  03:59:27 PM      42      332   39.21   20.83     8.81
                  04:00:27 PM      48      348   47.13   26.40    11.47
                  04:01:27 PM      57      361   52.45   31.67    14.21
                  04:02:27 PM      45      416   59.42   37.36    17.25
                  04:03:27 PM      26      338   57.97   41.22    19.85
                  Average:         44      359   51.24   31.50    14.32
                  In the above, the runq-sz column indicates the CPU run queue length - the number of processes/threads that are waiting for time to run on the CPUs. A run queue of 42 is very bad if the server has a single CPU, but probably reasonable if the server has 42 CPUs.

                  There is no "CPU WAIT" that is output to the trace file (as far as I know - although it is possible that "resmgr: cpu quantum" might appear, see http://download.oracle.com/docs/cd/B19306_01/server.102/b14237/waitevents003.htm). You can possibly monitor the delta values in V$OSSTAT, V$SYS_TIME_MODEL, or V$SESS_TIME_MODEL (for the specific session) to see what is happening - V$SESS_TIME_MODEL will list the "DB time" statistic for each session.
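
                  As a rough illustration of the delta idea, something like this could poll the 'DB time' statistic for one session once a minute (a sketch in Python with cx_Oracle; the SID and connection details are placeholders, and V$SESS_TIME_MODEL reports its values in microseconds):

                  import time
                  import cx_Oracle  # assumes the cx_Oracle driver is available

                  connection = cx_Oracle.connect("system", "password", "S1P")  # placeholder credentials
                  cursor = connection.cursor()
                  target_sid = 123  # placeholder: SID of the session running the job

                  def db_time_seconds(sid):
                      # V$SESS_TIME_MODEL reports time model statistics in microseconds.
                      cursor.execute("""
                          select value
                            from v$sess_time_model
                           where sid = :sid
                             and stat_name = 'DB time'""", sid=sid)
                      return cursor.fetchone()[0] / 1000000.0

                  previous = db_time_seconds(target_sid)
                  while True:
                      time.sleep(60)  # sample once a minute
                      current = db_time_seconds(target_sid)
                      print(f"DB time accumulated in the last minute: {current - previous:.2f} s")
                      previous = current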

                  If you have access to a client computer running Windows, you could try something like this to watch the session activity:
                  http://hoopercharles.wordpress.com/2010/01/13/working-with-oracle-time-model-data/

                  I think I have only enabled a 10046 trace with ORADEBUG once (essentially just to confirm that, neat, it works), so I can't answer how that method affects the trace file output.

                  If you spend some time manually analyzing the trace file looking for gaps in the tim= values that cannot be associated with wait events or CPU time, you might be able to figure out what is happening to the lost time.
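
                  A possible starting point for that kind of gap hunt (just a sketch in Python: it flags any jump in the tim= values larger than an arbitrary one-second threshold, so it will also flag legitimate idle time such as SQL*Net waits between calls):

                  import re

                  threshold_us = 1000000  # flag gaps longer than 1 second (arbitrary threshold)
                  tim_re = re.compile(r"tim=(\d+)")

                  prev_tim = None
                  with open("s1p_ora_9761.trc") as trace:
                      for line in trace:
                          match = tim_re.search(line)
                          if not match:
                              continue
                          tim_us = int(match.group(1))
                          if prev_tim is not None and tim_us - prev_tim > threshold_us:
                              print(f"gap of {(tim_us - prev_tim) / 1000000.0:.3f} s before: {line.strip()}")
                          prev_tim = tim_us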

                  Charles Hooper
                  Co-author of "Expert Oracle Practices: Oracle Database Administration from the Oak Table"
                  http://hoopercharles.wordpress.com/
                  IT Manager/Oracle DBA
                  K&M Machine-Fabricating, Inc.