1 2 Previous Next 28 Replies Latest reply: Jun 24, 2009 1:10 PM by Randolf Geist Go to original post RSS
      • 15. Re: Latch problem.
        585227
        Thanks a lot Amitzi.

        Thanks,

        Rana.
        • 16. Re: Latch problem.
          Randolf Geist
          user582224 wrote:
          I bit description at the background is. Since this application uses lot of literal values a year back I'd suggested and changed cursor_sharing to similar. Since then there was improvement of around 25%.

          Though I've cursor_sharing as similar, what I'm confused is about Library cache latches without having many hard parses.
          Rana,

          if you're using CURSOR_SHARING=SIMILAR it might be worth to check if you have SQLs with a very high version count due to unshared child cursors because of literals marked as "unsafe" to share, which is usually caused by histograms on columns or the usage of non-equality predicates (eg >, <, >=, <=, LIKE) in the queries.

          You could use e.g. the V$SQLSTATS dynamic performance view (less latching than e.g. V$SQLAREA) to find the SQLs with the highest version count.

          If you encounter very high version counts this might be another reason why you have library cache latch contention issues (although, as Jonathan has pointed out, you seem to have more significant issues than the library cache latch contention).

          In order to address that you need to check why you get these high version counts which might be expected behaviour with CURSOR_SHARING=SIMILAR.

          If the issue is caused by histograms you could check if these histograms are actually helpful, i.e. do you get different execution plans and do they perform significantly different. You might be able to get rid of some of the histograms (using appropriate DBMS_STATS calls with METHOD_OPT parameter set to "FOR COLUMNS SIZE 1 <column_list_separated_by_comma_to_remove_the_histogram_from>") without affecting the performance, or you might consider to use CURSOR_SHARING=FORCE instead (which of course might introduce other issues).

          I would be careful with the recommended CURSOR_SPACE_FOR_TIME setting, I doubt that it will make a significant difference in your situation. The increased SESSION_CACHED_CURSORS setting might help although to reduce the parse rate.

          Regards,
          Randolf

          Oracle related stuff blog:
          http://oracle-randolf.blogspot.com/

          SQLTools++ for Oracle (Open source Oracle GUI for Windows):
          http://www.sqltools-plusplus.org:7676/
          http://sourceforge.net/projects/sqlt-pp/
          • 17. Re: Latch problem.
            585227
            Hi Randolf,

            Thanks for your input. Actually I has CURSOR_SHARING to EXACT until last six month which I changed to SIMILAR due to high hard parsing. The issue is addressed but brought new problem of excessive soft parse.

            I'm bit reluctant to user FORCE as I've seen unusual behaviour with some of my application (for other databases).

            Thanks,

            Rana.
            • 18. Re: Latch problem.
              26741
              I wouldn't advice CURSOR_SPACE_FOR_TIME. See MetaLink Note#6696453.8
              • 19. Re: Latch problem.
                Randolf Geist
                user582224 wrote:
                Thanks for your input. Actually I has CURSOR_SHARING to EXACT until last six month which I changed to SIMILAR due to high hard parsing. The issue is addressed but brought new problem of excessive soft parse.

                I'm bit reluctant to user FORCE as I've seen unusual behaviour with some of my application (for other databases).
                Rana,

                I'm not sure if you actually got my point: It was not about the soft parse issue per se, but the increased CPU usage and latch contention due to holding the latch longer caused by high version counts which can happen with CURSOR_SHARING=SIMILAR.

                So this is merely about reducing the CPU usage and latch contention caused by the soft parses, it doesn't address the soft parse rate itself.

                Your main issue seems to be the excessive number of executions per second/transaction, addressing this will reduce the (soft) parse rate and therefore very likely solve your CPU and latch contention issues.

                If and how this can be addressed we can't tell with the information provided so far. Using higher SESSION_CACHED_CURSORS settings might help as a damage limitation (but not sure if at all and to what extent), and you can check your current cache hits against the number of parse calls using some queries, see e.g. Metalink note 208857.1 and 208918.1.

                Regards,
                Randolf

                Oracle related stuff blog:
                http://oracle-randolf.blogspot.com/

                SQLTools++ for Oracle (Open source Oracle GUI for Windows):
                http://www.sqltools-plusplus.org:7676/
                http://sourceforge.net/projects/sqlt-pp/
                • 20. Re: Latch problem.
                  26741
                  To add to Randolf's notes, see MetaLink Notes #261020.1 and #377847.1
                  • 21. Re: Latch problem.
                    585227
                    Thanks Randolf and Hemant for all the help.

                    Thanks,

                    Rana.
                    • 22. Re: Latch problem.
                      Jonathan Lewis
                      Liron Amitzi wrote:

                      There is a relation to parse here. You are right about hard parse (there are not that many), but there are many soft parses.
                      You can see that most of the latch contetion is on library cache latch (which is used in soft parse - to search the library cache for the cursor).

                      I had several cases in which many soft parses caused high library cache latch contention and this caused the server to hang with 100% CPU.
                      Liron,

                      Depending on the pattern of code usage and version of Oracle - a "soft parse" which turns into a simple search of the library cache may turn into 2 gets on the library cache latch. An execution may also turn into two gets on the library cache latch. So the 450 parse calls per seocnd may be about 900 gets compared to the 5,500 executions per second being a possible 11,000 gets.

                      I too have seen cases where excessive parse calls - especially with cursor_sharing=similar have resulted in excessive latch contention and a signficant amount of CPU spent on latch spinning: but as I said, excessive CPU makes latching problems worse - beware of reversing cause and effect.

                      In this case you can see that Oracle is claiming that more than 99% of the CPU is non-parse CPU (not that I would trust any of the "instance efficiency percentages completely" until I had cross checked them in the "instance statistics"). This tends to suggest that the statistic on parse calls is a secondary issue.

                      To be continued - OTN is being very difficult at the moment (in my part of the world at least).

                      Regards
                      Jonathan Lewis
                      http://jonathanlewis.wordpress.com
                      http://www.jlcomp.demon.co.uk


                      "The saddest aspect of life right now is that science gathers knowledge faster than society gathers wisdom."
                      Isaac Asimov
                      • 23. Re: Latch problem.
                        700998
                        Are you using AIX?

                        It does memory management different that most unix versions that will cause memory swapping, if some parameters are not properly defined.

                        That will not have any effect on the issues going on in the database, but memory swapping is very bad and should be avoided in all cases.
                        • 24. Re: Latch problem.
                          Jonathan Lewis
                          WinRed

                          You've certainly pick up on an interesting oddity.

                          The block changes per second is about 5,000 - but the redo size is about 35K, an average of about 7 bytes per block change. But a very small change (e.g. update a single byte character column) should generate something like 100 bytes of redo.

                          So I don't want to comment on what the OP is doing with his executions until I see more information.

                          Regards
                          Jonathan Lewis
                          http://jonathanlewis.wordpress.com
                          http://www.jlcomp.demon.co.uk

                          To post code, statspack/AWR report, execution plans or trace files, start and end the section with the tag {noformat}
                          {noformat} (lowercase, curly brackets, no spaces) so that the text appears in 
                          fixed format
                          .
                          
                          "There's no sense in being precise when you don't even know what you're talking about"
                          John von Neumann                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        
                          • 25. Re: Latch problem.
                            Jonathan Lewis
                            Rana,
                            If you've set the sga_target and are letting Oracle resize the library cache and db cache size, then you can run into problems as a resize takes place. (And we can see that a few resize operations have taken place in the interval).
                            There is a sample query against v$sga_resize_ops on my blog (http://jonathanlewis.wordpress.com/2007/04/16/sga-resizing/) to list the type and frequency of resize operations. It's worth a quick check - and if it looks like the caches are constantly trading memory, you might want to fix the sizes and disable automatic memory management.

                            However - you have shown us a report for 7,200 seconds, using 3 CPUs (for a total of 21,600 CPU seconds) where your largest time component in Oracle is the CPU at roughly 68% usages.
                            I'd still like to see the OS Statistics (showing things like BUSY_TIME) to see how much CPU is going on outside Oracle because what we've seen so far is a couple of odd numbers, and a hint that you may be overloading the CPU.

                            Swap problems are not an obvious conclusion to jump to based on current data - the single block read times are a little slow (11 m/s) - but if you had serious swapping problems I think they could look a lot worse.

                            Can you give us a sample of the top few statements in "SQL ordered by CPU" and "SQL ordered by Executions"; it might be useful to see "SQL ordered by Version Count" as well (see Randolf's note).

                            Liron's question about session_cached_cursors may be of some help (but not critical) - but we can't really tell without first seeing the instance statistics

                            parse count
                            session cursor cache hits
                            session cursor cache count

                            As Liron has pointed out, "soft parses" are virtually free if the parse call uses the session cursor cache information to bypass the library cache search - but you don't appear to be using CPU on parsing.

                            Don't mess with the cursor_space_for_time parameter - that's something you play with only if you've got a very good application.

                            Regards
                            Jonathan Lewis
                            http://jonathanlewis.wordpress.com
                            http://www.jlcomp.demon.co.uk

                            To post code, statspack/AWR report, execution plans or trace files, start and end the section with the tag {noformat}
                            {noformat} (lowercase, curly brackets, no spaces) so that the text appears in 
                            fixed format
                            .
                            
                            "There's no sense in being precise when you don't even know what you're talking about"
                            John von Neumann                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                
                            • 26. Re: Latch problem.
                              Liron Amitzi
                              >
                              Jonathan Lewis wrote:

                              beware of reversing cause and effect
                              >

                              Hi Jonathan,
                              I didn't reverse the cause and effect, I meant exactly what I said.
                              Latch is a mechanism to protect internal memory structures. When a process tries to get a latch and it is already locked, the process goes into a busy loop to try to get the latch again, after certain amount of retries (several hundreds AFAIK) it sleeps and then goes into the busy loop again. This causes a very high CPU usage.
                              You are right that if the CPU is busy, latch problems can raise. But if you have latch problems you will have high CPU as well.
                              This specific problem goes both ways.

                              Liron Amitzi
                              Senior DBA consultant
                              [www.dbsnaps.com]
                              [www.orbiumsoftware.com]
                              • 27. Re: Latch problem.
                                Jonathan Lewis
                                Liron Amitzi wrote:

                                I didn't reverse the cause and effect, I meant exactly what I said.
                                That's what I thought, and that's why I warned you about checking cause and effect in this case.
                                Latch is a mechanism to protect internal memory structures. When a process tries to get a latch and it is already locked, the process goes into a busy loop to try to get the latch again, after certain amount of retries (several hundreds AFAIK) it sleeps and then goes into the busy loop again. This causes a very high CPU usage.
                                I think that's fairly well known by now, although it's less commonly appreciated that you have a CPU problem due to latching even if you don't see any sleeps.
                                You are right that if the CPU is busy, latch problems can raise. But if you have latch problems you will have high CPU as well.
                                This specific problem goes both ways.
                                Indeed, and that's why I was at pains to point out that you've described the traditional problem when the numbers suggest that it's not the latching that's the problem.

                                Obviously we need to see some underlying statistics - but if we assume that what we see is close to true:
                                CPU Time (seconds)               14,995
                                Percentage parse CPU               0.54%
                                Parse time CPU (derived seconds)     80.973
                                Parse time CPU / Parse time elapsed     8.09%
                                Parse time elapsed (derived seconds)     1,001
                                Latch sleep time (seconds)                5,076
                                It really seems unlikely that the parsing is a significant cause. It's more likely that the time lost on latching may be exacerbated by the excessive CPU usage elsewhere (although, as I've mentioned above, it's possible that there's a problem caused by dynamic resizing of the various caches).

                                Regards
                                Jonathan Lewis
                                http://jonathanlewis.wordpress.com
                                http://www.jlcomp.demon.co.uk

                                "For every expert there is an equal and opposite expert."
                                Arthur C. Clarke
                                • 28. Re: Latch problem.
                                  Randolf Geist
                                  user582224 wrote:
                                  Thanks Randolf and Hemant for all the help.
                                  Rana,

                                  since you seem to have a AWR license, what do the ADDM reports tell you?

                                  You can use $ORACLE_HOME/rdbms/admin/addmrpt.sql to generate a report for a pair of AWR snapshots or check the DBA_ADVISOR_FINDINGS/RECOMMENDATIONS views for the automatically generated ADDM reports (by default hourly).

                                  Regards,
                                  Randolf

                                  Oracle related stuff blog:
                                  http://oracle-randolf.blogspot.com/

                                  SQLTools++ for Oracle (Open source Oracle GUI for Windows):
                                  http://www.sqltools-plusplus.org:7676/
                                  http://sourceforge.net/projects/sqlt-pp/
                                  1 2 Previous Next