2 Replies Latest reply on Jan 21, 2014 4:07 AM by mbobak

    DB issue because of HUGE Page Parameter in LINUX

    summity.gupta

      We recently did an Oracle DB upgrade and migrated the database from Unix to Linux for Siebel. A couple of days later, there was a major performance issue in the Production environment that brought the database down. The client DBA team identified the problem as the HugePages setting in Linux being off. Once HugePages was enabled in Linux, the issue was resolved.

       

      However, we never encountered the issue in other environments with HugePages turned off, not even during performance testing. We reran the performance tests under rigorous load, in a performance test environment with a database size similar to Production and HugePages off, but the environment remained stable. We think we might be missing something important in our performance testing.

       

      It would be a great help if you could provide some insight into what the cause might be.

      We checked the number of insert/update/delete statements; the performance and production environments were almost the same.

      The number of database connections in the performance environment was lower than in production. Could that be the cause?

        • 1. Re: DB issue because of HUGE Page Parameter in LINUX
          Hemant K Chitale

          Huge Pages reduces the size of the PageTable.  The PageTable can be very large for a large SGA combined with many concurrent connections (because each connection needs to map to the SGA).  Therefore, a significant difference in the number of concurrent connections can increase the size of the PageTable when not using Huge Pages.

           

          I am NOT asserting that the performance problem was due to the lack of Huge Pages configured.  It may have been so.

           

          See Oracle Support note "HugePages on Linux: What It Is... and What It Is Not... (Doc ID 361323.1)"

           

           

          Hemant K Chitale

          • 2. Re: DB issue because of HUGE Page Parameter in LINUX
            mbobak

            Summity Gupta wrote:

             

            We recently did an Oracle DB upgrade and migrated the database from Unix to Linux for Siebel. A couple of days later, there was a major performance issue in the Production environment that brought the database down. The client DBA team identified the problem as the HugePages setting in Linux being off. Once HugePages was enabled in Linux, the issue was resolved.

             

            However, we never encountered the issue in other environments with HugePages turned off, not even during performance testing. We reran the performance tests under rigorous load, in a performance test environment with a database size similar to Production and HugePages off, but the environment remained stable. We think we might be missing something important in our performance testing.

             

            It would be a great help if you could provide some insight into what the cause might be.

            We checked the number of insert/update/delete statements; the performance and production environments were almost the same.

            The number of database connections in the performance environment was lower than in production. Could that be the cause?

            A couple of points. First, I'm going to take a stronger position on hugepages than Hemant: I'll go as far as to say that for any non-trivial SGA size, if you're *not* using hugepages, you're doing it wrong. Second, on your question about having fewer connections in the performance testing environment than in production: there's no way to tell for sure, but yes, probably.

             

            Let me expound a bit more on hugepages. There are three main points to consider. First, when allocating a large SGA, each page needs an entry in the page table. On a 64-bit architecture, each PTE (page table entry) is 8 bytes. So, let's assume you are allocating a 20GB SGA. Standard shared memory pages are 4KB, so to allocate 20GB of SGA you need 5,242,880 4KB pages. Each of those pages requires an 8-byte entry in the page table, so that's 40MB of page table entries.
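            That arithmetic can be sanity-checked with a few lines of Python (the 20GB SGA, 4KB page size, and 8-byte PTE are the assumed figures from the example above, not measured values):

```python
# Page table overhead for a 20GB SGA mapped with standard 4KB pages.
# Figures follow the example above; a PTE is 8 bytes on x86-64.

SGA_BYTES = 20 * 1024**3      # 20 GB SGA
PAGE_SIZE = 4 * 1024          # standard 4 KB page
PTE_SIZE  = 8                 # bytes per page table entry (64-bit)

pages = SGA_BYTES // PAGE_SIZE          # pages needed to map the SGA
pte_bytes = pages * PTE_SIZE            # page table size per process

print(f"4KB pages needed:       {pages:,}")                  # 5,242,880
print(f"page table per process: {pte_bytes // 1024**2} MB")  # 40 MB
```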

             

            The second point: with standard 4KB pages, each process that attaches to the SGA needs its own copy of the page table. So, if you have, say, 200 dedicated server processes, that implies 8,000MB of page table entries (200 x 40MB). So, for a 20GB SGA, you have an extra 8GB (approximately) of overhead.

             

            Compare that to the same 20GB SGA implemented with hugepages. 20GB, using 2MB hugepages, means 10,240 pages, and correspondingly 10,240 page table entries. Again, a page table entry is 8 bytes, so for the same 20GB SGA the page table overhead is only 80KB. The other big saving is that with hugepages the page table is *shared*: you need only one copy of that 80KB page table for your 20GB SGA, regardless of how many dedicated server processes attach to the SGA!
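            The two layouts can be put side by side in a short script (same assumed figures: 20GB SGA, 2MB hugepages, 8-byte PTEs, and an illustrative 200 dedicated server processes):

```python
# Compare page table overhead: 2MB hugepages (one shared table) vs
# standard 4KB pages (one table copy per attached process).

SGA_BYTES = 20 * 1024**3
HUGE_PAGE = 2 * 1024**2       # 2 MB hugepage
PTE_SIZE  = 8                 # bytes per PTE (64-bit)
PROCESSES = 200               # example dedicated server count

huge_pages = SGA_BYTES // HUGE_PAGE        # 10,240 hugepages
huge_pt = huge_pages * PTE_SIZE            # 80 KB, shared by everyone

std_pt_per_proc = (SGA_BYTES // 4096) * PTE_SIZE   # 40 MB per process
std_pt_total = std_pt_per_proc * PROCESSES         # 8,000 MB in total

print(f"hugepage table (shared):        {huge_pt // 1024} KB")
print(f"4KB tables x {PROCESSES} processes: {std_pt_total // 1024**2} MB")
```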

             

            Finally, hugepages are locked in memory and cannot be swapped out, which can also add significantly to stability.
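            As an aside, whether hugepages are configured, and how many are actually in use, can be read from /proc/meminfo on the database server. Here is a minimal parsing sketch; the sample text below is illustrative only, and on a real server you would read the file itself:

```python
# Parse the HugePages fields out of /proc/meminfo-style text.
# SAMPLE is made-up illustrative data; on a real Linux server use:
#   text = open("/proc/meminfo").read()

SAMPLE = """\
HugePages_Total:   10240
HugePages_Free:      128
Hugepagesize:       2048 kB
"""

def hugepage_stats(text):
    stats = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        if key.startswith("Huge"):
            stats[key] = int(rest.split()[0])  # first numeric field
    return stats

s = hugepage_stats(SAMPLE)
# Hugepagesize is reported in kB, so (pages in use) * size / 1024 -> MB
used_mb = (s["HugePages_Total"] - s["HugePages_Free"]) * s["Hugepagesize"] // 1024
print(f"hugepages in use: {used_mb} MB")   # 20224 MB for the sample data
```

            If HugePages_Free stays equal to HugePages_Total after the instance starts, the SGA is not actually using the pool, which is worth checking after any change to the setting.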

             

            So, depending on the size of your SGA, the number of dedicated server processes, and the total amount of RAM on the server: yes, without hugepages configured, it's certainly possible that the reduced number of connections during your performance test is the reason you didn't suffer the kind of performance problems you did on the production system.

             

            As I said before, if you're not using hugepages, you're doing it wrong!

             

            Hope that helps,

             

            -Mark