This discussion is archived
11 Replies Latest reply: Apr 1, 2010 3:22 PM by CharlesHooper RSS

Bad performance - Comparing stats pack information from a good time period:

761063 Newbie
Currently Being Moderated
Users are complaining about a spike in performance. I pulled AWR reports 1 hour before and then during the issue.

I pulled 1 hour in length reports.

In comparing the trouble hour with the good hour I found the following:

1. Massive Redo. 270K/sec vs. 35K/sec
2. dbfile sequential reads higher
3. Higher hard parse/parse percentages
4. Higher db time
5. SQL had 32 elap per execs vs. nearly 0 (not sure what this means).
6. massive undo - transactions were 180K vs. 18K the hour previous.
7. Specifically at 12:14 there were 86k ransactions with 20K undo blocks (the spike time).
8. SQL area in the library cache stats showed a 52% miss ratio.

Any ideas where I can go with this data? What do I point the finger at? I am guessing there was some massing ad-hoc sql being run? Sorry I am new to P/T.

thanks for any insight.

-dallas
  • 1. Re: Bad performance - Comparing stats pack information from a good time period:
    sb92075 Guru
    Currently Being Moderated
    I am guessing there was some massing ad-hoc sql being run?
    SELECT does not generate redo/undo
  • 2. Re: Bad performance - Comparing stats pack information from a good time period:
    761063 Newbie
    Currently Being Moderated
    So what would be your guess? What information in the AWR reports would or could I look into for a RCA. At this point I'm just asking to see where you guys would start digging. If DML-ish type stuff is not going to create what appears to be a redo issue and DDL would - do I look into the sql executions? The highest (time-wise) sql is a select statement (900 seconds, taking 17% of the dbtime). /shrug

    -dallas
  • 3. Re: Bad performance - Comparing stats pack information from a good time period:
    sb92075 Guru
    Currently Being Moderated
    Post Operating System (OS) name & version for DB server system.
    Post results of
    SELECT * from v$version

    In my opinion, AWR is next to useless to identifying performance issue.
  • 4. Re: Bad performance - Comparing stats pack information from a good time period:
    sybrand_b Guru
    Currently Being Moderated
    1 hour intervals are fully useless.

    --------
    Sybrand Bakker
    Senior Oracle DBA
  • 5. Re: Bad performance - Comparing stats pack information from a good time period:
    680087 Pro
    Currently Being Moderated
    sb92075 wrote:
    SELECT does not generate redo/undo
    SQL> select n.name, s.value from v$sesstat s, v$statname n
      2  where s.statistic#=n.statistic# and s.sid=userenv('sid') and (n.name like 'undo%' or n.name = 'redo size');
    
    NAME                                                                  VALUE
    ---------------------------------------------------------------- ----------
    redo size                                                                 0
    undo change vector size                                                   0
    undo segment header was pinned                                            0
    
    SQL> with t as (select null from v$statname)
      2  select count(*)
      3    from t t1, t t2;
    
      COUNT(*)
    ----------
        132496
    
    SQL> select n.name, s.value from v$sesstat s, v$statname n
      2  where s.statistic#=n.statistic# and s.sid=userenv('sid') and (n.name like 'undo%' or n.name = 'redo size');
    
    NAME                                                                  VALUE
    ---------------------------------------------------------------- ----------
    redo size                                                               736
    undo change vector size                                                 168
    undo segment header was pinned                                            0
    There is delayed block cleanout too.
  • 6. Re: Bad performance - Comparing stats pack information from a good time period:
    sybrand_b Guru
    Currently Being Moderated
    You look in the TOP-5 events first

    -------
    Sybrand Bakker
    Senior Oracle DBA
  • 7. Re: Bad performance - Comparing stats pack information from a good time period:
    761063 Newbie
    Currently Being Moderated
    10.2.0.4.0 64bit on 64bit linux

    Top 5 Timed Events

    Event_____________|__Waits_|_Time(s)_|_Avg Wait(ms)_|_% Total Call Time_|_Wait Class
    CPU time_______________________1,538__________________________34.0
    db file sequential read___181,367_____902_____________5____________20.0____User I/O
    gc buffer busy__________40,632_____640____________16____________14.2_____Cluster
    gc current block busy___13,575______405____________30_____________9.0_____Cluster
    log file parallel write_____95,390______293_____________3_____________6.5__System I/O
  • 8. Re: Bad performance - Comparing stats pack information from a good time period:
    761063 Newbie
    Currently Being Moderated
    IS or HOW would disk contention on index reads be related to high REDO?
  • 9. Re: Bad performance - Comparing stats pack information from a good time period:
    370095 Explorer
    Currently Being Moderated
    there is clear indication that some DML statements were running on database. RUN ASH report and check for insert, update and delete statements executed in that period.
  • 10. Re: Bad performance - Comparing stats pack information from a good time period:
    sb92075 Guru
    Currently Being Moderated
    IS or HOW would disk contention on index reads be related to high REDO?
    High REDO results from DML.
    Index reads could result from using indexes to decide which rows get changed.

    It would help if you could identify the SQL producing the high redo rates.
  • 11. Re: Bad performance - Comparing stats pack information from a good time period:
    CharlesHooper Expert
    Currently Being Moderated
    Top 5 Timed Events 
     
    Event                       Waits | Time(s) | Avg Wait(ms) | % Total Call Time | Wait Class
    CPU time                              1,538                              34.0 
    db file sequential read   181,367       902              5               20.0    User I/O 
    gc buffer busy             40,632       640             16               14.2     Cluster 
    gc current block busy      13,575       405             30                9.0     Cluster 
    log file parallel write    95,390       293              3                6.5  System I/O
    I agree with Sybrand in this case. The average waits do not show much of a problem for this time period, and this instance in the RAC configuration. For this one hour time period, if the server only had a single CPU, it would only be about 43% busy - you can use that statistic to get a handle on the average impact of the other true wait events that are listed in the top 5. There is no log file sync wait i(foreground) n the top 5, which is somewhat interesting given that there were 95,390 waits on log file parallel write (the background wait). The average wait time for the global cache wait events seems to be a little high.

    Suggestions:
    1. Check the documentation: http://download.oracle.com/docs/cd/B19306_01/rac.102/b28759/racmon.htm
    2. Since you are looking at an AWR report, I assume that you have the appropriate license for that feature - if so that means that you should be able to view an ADDM report.
    3. Take a look at the activity of the other nodes in the RAC configuration, generate an AWR report for those nodes.
    4. Try to generate an AWR report for a 10 to 15 minute time period during the period of poor performance.
    5. Find a session that is exhibiting a performance problem and enable a 10046 trace at level 8 or 12 for that session.
    6. When posting information where spacing is critical, use a { code } tag (without spaces) before and after that section of your post.

    Charles Hooper
    Co-author of "Expert Oracle Practices: Oracle Database Administration from the Oak Table"
    http://hoopercharles.wordpress.com/
    IT Manager/Oracle DBA
    K&M Machine-Fabricating, Inc.

Legend

  • Correct Answers - 10 points
  • Helpful Answers - 5 points