      • 15. Re: Excessive waits; latch: library cache + latch: shared pool
        mbobak
        It's cool. You don't know me from anyone. You don't want to go to the boss and say "Hey, I need a downtime, and I'm going to make a major database server config change - I'm going to implement HugePages - all because some guy on the internet told me to!" :-)

        I am curious, though - I don't think you ever posted it. What do you see in:
        grep HugePages /proc/meminfo
        when you have 700+ server processes?
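        For reference (illustrative values only, not a prediction for your system), on a box where HugePages have been reserved, the matching lines look something like:

        HugePages_Total:  6144
        HugePages_Free:    512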

        -Mark
        • 16. Re: Excessive waits; latch: library cache + latch: shared pool
          Jonathan Lewis
          athompson88 wrote:
          Oh, I'm convinced. I'm just doing my due diligence, because if I'm going to go back to my boss and make this recommendation, I need to understand everything about what I'm recommending. That was why I was asking about the math. I found a reference to something that says the number might vary due to "lazy allocation" (?) of memory. In any case, I'm not being argumentative at this point, and I apologize if it came off that way in my latest comment. Probably the result of several hours of looking at text logs and comparing memory allocations for several hundred processes in top output. :)
          A couple of points that might "fix" the maths. Mark suggested 4KB memory pages, but some operating systems use 8KB as their standard memory page size; that brings the theoretical figures closer to what you see. Then your observation about "lazy allocation" is relevant: unless you set the "pre_page_sga" parameter (and the name I've used may not be quite right), the processes won't build the entire page map as they connect - they'll build it incrementally as they use the memory.
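          As a back-of-envelope sketch of the scale involved (the 12GB SGA and 8-byte page-table entry below are illustrative assumptions, not your actual figures):

          SGA_BYTES=$((12 * 1024 * 1024 * 1024))   # assume a 12GB SGA, for illustration
          PAGE=4096                                # 8192 on an 8KB-page O/S
          PTE=8                                    # typical page-table entry size
          echo "$(( SGA_BYTES / PAGE * PTE / 1024 / 1024 )) MB of page tables per process, fully mapped"

          With 4KB pages that comes to about 24MB per process, so 700 processes that had each mapped the whole SGA would account for roughly 16GB of page tables; lazy allocation is why the real figure is much smaller.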

          Regards
          Jonathan Lewis
          • 17. Re: Excessive waits; latch: library cache + latch: shared pool
            athompson88
            Easily done. Here's a before/after from one of the "pauses" yesterday, taken at 12:54 EST and 12:55 EST respectively. Memory dropped rapidly, from about 1G free at 12:51 down to what you see here. I tried to trace it back to an individual process, and from there back to a SID, but no matter how many "pauses" I analyze, none of the SQL for the offending processes looks odd or out of place. I think we've just finally grown to the point of needing more memory, or of needing to use the existing memory more efficiently. Also, during this time there were 950 active sessions on this node.

            [oracle@rac1 snaps]$ head -30 snapshot_201103101254.txt
            Meminfo
            -----------------------------------------------------
            MemTotal: 16414696 kB
            MemFree: 27716 kB
            Buffers: 744 kB
            Cached: 6937504 kB
            SwapCached: 117908 kB
            Active: 12272384 kB
            Inactive: 36924 kB
            HighTotal: 0 kB
            HighFree: 0 kB
            LowTotal: 16414696 kB
            LowFree: 27716 kB
            SwapTotal: 12586916 kB
            SwapFree: 11053324 kB
            Dirty: 0 kB
            Writeback: 1116 kB
            Mapped: 12288120 kB
            Slab: 224256 kB
            CommitLimit: 20794264 kB
            Committed_AS: 21862496 kB
            PageTables: 3747376 kB
            VmallocTotal: 536870911 kB
            VmallocUsed: 272204 kB
            VmallocChunk: 536598131 kB
            HugePages_Total: 0
            HugePages_Free: 0
            Hugepagesize: 2048 kB


            [oracle@rac1 snaps]$ head -30 snapshot_201103101255.txt
            Meminfo
            -----------------------------------------------------
            MemTotal: 16414696 kB
            MemFree: 1506156 kB
            Buffers: 3728 kB
            Cached: 6959688 kB
            SwapCached: 139864 kB
            Active: 10714736 kB
            Inactive: 92792 kB
            HighTotal: 0 kB
            HighFree: 0 kB
            LowTotal: 16414696 kB
            LowFree: 1506156 kB
            SwapTotal: 12586916 kB
            SwapFree: 10959276 kB
            Dirty: 492 kB
            Writeback: 8 kB
            Mapped: 10727228 kB
            Slab: 220336 kB
            CommitLimit: 20794264 kB
            Committed_AS: 20496268 kB
            PageTables: 3773840 kB
            VmallocTotal: 536870911 kB
            VmallocUsed: 272204 kB
            VmallocChunk: 536598131 kB
            HugePages_Total: 0
            HugePages_Free: 0
            Hugepagesize: 2048 kB
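            In case it's useful, snapshots like these can be captured with a trivial loop; this is just a sketch, not necessarily the exact script that produced mine:

            while true; do
              { echo "Meminfo"
                echo "-----------------------------------------------------"
                cat /proc/meminfo
              } > snapshot_$(date +%Y%m%d%H%M).txt
              sleep 60
            done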
            • 18. Re: Excessive waits; latch: library cache + latch: shared pool
              athompson88
              Jonathan,

              I read about the pre_page_sga parameter, and it's not in play here. To address your other point: we're using RHEL4, and we do indeed have a 4KB page size. I will say this - I've been a DBA for nearly 5 years (not a real long time, I suppose), but I've only worked with RAC for the past 2 1/2, and this has been a great opportunity to learn about memory systems on RAC vs. a single-instance database. It has actually been a good exercise in learning a lot about the SGA and PGA in general.
              • 19. Re: Excessive waits; latch: library cache + latch: shared pool
                Jonathan Lewis
                athompson88 wrote:
                Jonathan,

                I read about the pre_page_sga parameter, and it's not in play here. To address your other point: we're using RHEL4, and we do indeed have a 4KB page size. I will say this - I've been a DBA for nearly 5 years (not a real long time, I suppose), but I've only worked with RAC for the past 2 1/2, and this has been a great opportunity to learn about memory systems on RAC vs. a single-instance database. It has actually been a good exercise in learning a lot about the SGA and PGA in general.
                You've probably come across this before - but Christo Kutrovsky from Pythian did a good presentation on memory a couple of years ago at the UKOUG annual conference. A version of it is available on the Pythian site as a video with backing notes: http://www.pythian.com/blogs/741/pythian-goodies-free-memory-swap-oracle-and-everything

                Just catching up on the alphabetti spaghetti - your question about ASM, huge pages, and the PGA.

                I think Mark was answering with respect to the ASM instance - but I think you may have been thinking of automatic shared (SGA) memory management, leading on to 11g's automatic memory management.

                Hugepages are (at least on some O/Ss) locked pages, and the PGA can't use locked pages. In 10g you can set up automatic SGA memory management (sga_target) with huge pages; but in 11g, if you try to use automatic memory management (memory_target - which allows movement of memory between SGA and PGA), Oracle silently ignores the option to use hugepages. Someone wrote a blog about this recently - it may have been Kevin Closson.
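                To make that concrete, here's a minimal sketch of the combination that keeps hugepages in play (the sizes are purely illustrative, not recommendations):

                # /etc/sysctl.conf - reserve enough 2MB pages to cover the SGA
                vm.nr_hugepages = 6144          # 6144 * 2MB = 12GB

                # instance parameters - ASMM (sga_target) can use huge pages;
                # 11g AMM (memory_target) silently will not
                sga_target = 12G
                memory_target = 0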

                Regards
                Jonathan Lewis
                • 20. Re: Excessive waits; latch: library cache + latch: shared pool
                  athompson88
                  Jonathan,

                   No, I was referring to ASM the instance, not the automatic memory management system. I'm aware of the limitations of AMM on 11g. We are using ASMM currently, but we'll be moving away from that when we upgrade to 11gR2 later this year, to maintain support for hugepages. And since we'll also be doing a hardware upgrade when we make the move, we'll probably jump to 32G/node as opposed to our current 16G/node. With that kind of jump we'll definitely want to be using hugepages.