4 Replies Latest reply: Apr 26, 2012 3:53 PM by 745782

    Why is rman caching all files to memory during backup

    745782
      I have 3 RAC environments on Linux 5.5 with Oracle 11.2.0.2 and ASM. One RAC environment uses Automatic Memory Management (AMM, with MEMORY_TARGET and MEMORY_MAX_TARGET at 32G), and the other two use Automatic Shared Memory Management (ASMM, with SGA_TARGET and SGA_MAX_TARGET at 16G and a PGA of 5G). Each environment has 2 servers with 132G of memory each. I back up the databases locally on the 1st server of each environment. The problem is that whenever I run rman, all the backup files are cached in memory while they are written to disk, and the cached memory is never released afterwards. After a few days of rman runs there is no free memory left out of the 132G total, and the server starts using swap (18G). Once half the swap is in use, db connections are refused unless I drop the cache. This behavior is strange: Oracle says it's an OS issue, and the Linux side says it's Oracle's issue. I'll give an example below:

      1st time backup today:

      Before backup:

      top - 06:29:15 up 11 days, 16:59, 2 users, load average: 0.52, 0.78, 0.78
      Mem: 132081600k total, 42181080k used, 89900520k free, 540752k buffers
      Swap: 17825768k total, 372k used, 17825396k free, 37775732k cached

      After backup:

      top - 06:36:06 up 11 days, 17:06, 2 users, load average: 1.92, 1.87, 1.29
      Mem: 132081600k total, 44605392k used, 87476208k free, 552200k buffers
      Swap: 17825768k total, 372k used, 17825396k free, 40080044k cached


      2nd time backup today:

      Before backup:

      top - 06:46:28 up 11 days, 17:16, 2 users, load average: 0.75, 0.98, 1.07
      Mem: 132081600k total, 44551264k used, 87530336k free, 555228k buffers
      Swap: 17825768k total, 372k used, 17825396k free, 40083352k cached


      After backup:

      top - 06:52:18 up 11 days, 17:22, 2 users, load average: 2.23, 1.69, 1.36
      Mem: 132081600k total, 46813808k used, 85267792k free, 559936k buffers
      Swap: 17825768k total, 372k used, 17825396k free, 42268728k cached



      3rd time backup:

      Before backup:

      top - 07:13:03 up 11 days, 17:42, 2 users, load average: 0.45, 0.73, 0.88
      Mem: 132081600k total, 46818456k used, 85263144k free, 565864k buffers
      Swap: 17825768k total, 372k used, 17825396k free, 42275028k cached


      After backup:

      top - 07:19:43 up 11 days, 17:49, 2 users, load average: 1.10, 1.43, 1.17
      Mem: 132081600k total, 49080992k used, 83000608k free, 571088k buffers
      Swap: 17825768k total, 372k used, 17825396k free, 44462220k cached
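
To make the pattern in the three runs easier to see, here is a minimal Python sketch that subtracts the "free" and "cached" figures before and after each backup. The numbers are copied from the top output above; the names `runs`, `deltas` and `results` are only illustrative.

```python
# (free_kb, cached_kb) before and after each backup, copied from the top output
runs = {
    "1st": ((89900520, 37775732), (87476208, 40080044)),
    "2nd": ((87530336, 40083352), (85267792, 42268728)),
    "3rd": ((85263144, 42275028), (83000608, 44462220)),
}


def deltas(before, after):
    """Return (free memory lost, page cache gained) in kB for one backup run."""
    free_lost = before[0] - after[0]
    cached_gained = after[1] - before[1]
    return free_lost, cached_gained


results = {name: deltas(before, after) for name, (before, after) in runs.items()}

for name, (free_lost, cached_gained) in results.items():
    # 1024**2 kB per GiB
    print(f"{name} backup: free -{free_lost / 1024**2:.2f} GiB, "
          f"cached +{cached_gained / 1024**2:.2f} GiB")
```

Each run costs roughly 2 GB of free memory, and almost all of it shows up as additional cache rather than as memory held by a process.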


      One more interesting thing: if I delete the old backups, whether through rman or simply with 'rm -rf', the memory that moves from cached back to free is exactly equal to the size of the deleted rman files, as shown below (free memory increased from 9G to 24G; cached memory decreased from 111G to 97G):

      top - 07:06:55 up 188 days, 19:51, 2 users, load average: 0.77, 0.76, 0.65
      Mem: 132081600k total, 122547932k used, 9533668k free, 2153420k buffers
      Swap: 17825768k total, 0k used, 17825768k free, 111547704k cached

      rm -rf df_l0_NGSQA01_1321_1
      rm -rf df_l0_NGSQA01_1322_1
      rm -rf al_NGSQA01_1323_1
      rm -rf al_NGSQA01_1324_1
      rm -rf al_NGSQA01_1325_1
      rm -rf al_NGSQA01_1326_1
      rm -rf al_NGSQA01_1327_1
      rm -rf al_NGSQA01_1328_1
      rm -rf al_NGSQA01_1329_1
      rm -rf al_NGSQA01_1330_1
      rm -rf al_NGSQA01_1331_1
      rm -rf al_NGSQA01_1332_1

      top - 07:07:53 up 188 days, 19:52, 2 users, load average: 0.54, 0.69, 0.63
      Mem: 132081600k total, 108021332k used, 24060268k free, 2153432k buffers
      Swap: 17825768k total, 0k used, 17825768k free, 97254512k cached
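
The observation above fits how the Linux page cache works: cached pages belong to files, so removing a file also frees its pages. As a hedged illustration only (this is not something the poster did), the sketch below uses Python's `os.posix_fadvise` with `POSIX_FADV_DONTNEED` to ask the kernel to drop one file's cached pages without deleting the file. The function name `evict_from_page_cache` and the temporary file are hypothetical, and the call is advisory, so the kernel may keep some pages.

```python
import os
import tempfile


def evict_from_page_cache(path):
    """Ask the kernel to drop cached pages of `path`; return the file size in bytes."""
    fd = os.open(path, os.O_RDONLY)
    try:
        size = os.fstat(fd).st_size
        # Dirty pages cannot be dropped, so flush them to disk first.
        os.fsync(fd)
        # offset=0 and length=0 cover the whole file; this is advice, not a guarantee.
        os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_DONTNEED)
    finally:
        os.close(fd)
    return size


# Demonstration on a throwaway 1 MiB file standing in for a backup piece.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"\0" * 1024 * 1024)
    demo_path = tmp.name

evicted_bytes = evict_from_page_cache(demo_path)
os.remove(demo_path)
print(f"advised the kernel to drop {evicted_bytes} bytes from the page cache")
```

Comparing the "cached" figure in top before and after such a call on a real backup piece would show whether the kernel honoured the advice.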

      Can anybody please explain to me how to correct this issue?

      Thanks
      Satish

      Edited by: bgs on Apr 3, 2012 11:48 AM
        • 1. Re: Why is rman caching all files to memory during backup
          LKBrwn_DBA
          Wrong: it is NOT rman caching the files, it's Redhat.

          In our case Redhat was caching all the I/O from the SAN and was completely slowing down the DB.
          Cause: The sysadmins had misconfigured the cache.

          Good luck!
          :p
          • 2. Re: Why is rman caching all files to memory during backup
            745782
            Thanks, LKbrwn DBA! I've told my Linux Admin team to take a look into that and, if possible, open a ticket with RedHat. Oracle support has been insisting that it's not Oracle's problem. I've been stuck on this for the past 20 days, unable to solve it.

            Satish...
            • 3. Re: Why is rman caching all files to memory during backup
              912595
              bgsatish wrote:
              Thanks, LKbrwn DBA! I've told my Linux Admin team to take a look into that and, if possible, open a ticket with RedHat. Oracle support has been insisting that it's not Oracle's problem. I've been stuck on this for the past 20 days, unable to solve it.

              Satish...
              Take note: the Oracle RMAN process writes the backup files to the filesystem, and once the backup is finished RMAN releases the channel and therefore its processes. So it would be an OS issue that these backups are placed in the filesystem cache and are not released from there (RMAN has already released them and forgotten about them).
              • 4. Re: Why is rman caching all files to memory during backup
                745782
                I tried setting several kernel parameters, and also changed one of the hidden backup parameters as suggested by Oracle support. In the end, however, I found the solution by changing the db parameter FILESYSTEMIO_OPTIONS from 'none' to 'SETALL'. It worked so well that I tested it on two databases by switching the parameter back and forth, and the difference is like night and day. Changing this parameter requires a db restart. Some evidence is below:

                ----------------------------------------------------------------------------------------------------------------
                Setting the parameter to NONE:

                Before rman backup:

                top - 11:01:10 up 1 day, 20:38, 2 users, load average: 0.69, 0.76, 0.64
                Mem: 131950528k total, 37807788k used, 94142740k free, 584312k buffers
                Swap: 20971512k total, 0k used, 20971512k free, 35219992k cached

                After rman backup:

                top - 11:05:30 up 1 day, 20:43, 2 users, load average: 1.35, 1.21, 0.85
                Mem: 131950528k total, 38852068k used, 93098460k free, 586248k buffers
                Swap: 20971512k total, 0k used, 20971512k free, 36203412k cached

                About 1.05G of free memory consumed
                -----------------------------------------------------------------------------------------------------------------
                Setting the parameter to SETALL:

                Before Rman backup:

                top - 11:23:10 up 1 day, 21:00, 2 users, load average: 0.28, 0.63, 0.71
                Mem: 131950528k total, 38697084k used, 93253444k free, 588180k buffers
                Swap: 20971512k total, 0k used, 20971512k free, 36196732k cached

                After Rman backup:

                top - 11:31:48 up 1 day, 21:09, 2 users, load average: 1.05, 0.86, 0.81
                Mem: 131950528k total, 38702956k used, 93247572k free, 590764k buffers
                Swap: 20971512k total, 0k used, 20971512k free, 36179784k cached

                Only about 6M of free memory consumed
                ---------------------------------------------------------------------------------------------------------------
                Setting the parameter to NONE:

                Before Rman backup:

                top - 11:43:47 up 2 days, 21:21, 3 users, load average: 0.49, 0.60, 0.53
                Mem: 131950528k total, 39021576k used, 92928952k free, 701420k buffers
                Swap: 20971512k total, 0k used, 20971512k free, 36366412k cached

                After Rman backup:

                top - 11:50:00 up 2 days, 21:27, 4 users, load average: 2.15, 1.28, 0.79
                Mem: 131950528k total, 40200656k used, 91749872k free, 703404k buffers
                Swap: 20971512k total, 0k used, 20971512k free, 37483076k cached

                About 1.18G of free memory consumed
                --------------------------------------------------------------------------------------------------------------
                Setting the parameter to SETALL:

                Before Rman backup:

                top - 12:02:33 up 2 days, 21:40, 4 users, load average: 0.53, 0.83, 0.77
                Mem: 131950528k total, 40176156k used, 91774372k free, 704296k buffers
                Swap: 20971512k total, 0k used, 20971512k free, 37481212k cached

                After Rman backup:

                top - 12:08:04 up 2 days, 21:45, 4 users, load average: 2.10, 1.49, 1.04
                Mem: 131950528k total, 40184836k used, 91765692k free, 706208k buffers
                Swap: 20971512k total, 0k used, 20971512k free, 37463724k cached

                Only about 8.5M of free memory consumed
                --------------------------------------------------------------------------------------------------------------

                The databases are on RAC and the datafiles are on ASM (raw SAN devices). Setting this parameter has also reduced the day-to-day memory caching, which used to be quite high, to a minimum.
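
For anyone who wants to repeat this comparison without reading top by eye, here is a minimal Python sketch that diffs two snapshots in the `/proc/meminfo` format. The helper names (`parse_meminfo`, `cache_growth`) are hypothetical, and the sample snapshots are the first NONE and SETALL runs above, retyped in meminfo layout as an assumption; on a live server the text would come from reading `/proc/meminfo` before and after the backup.

```python
def parse_meminfo(text):
    """Map each /proc/meminfo field name to its value in kB."""
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            values[name.strip()] = int(parts[0])
    return values


def cache_growth(before, after):
    """Return the change in MemFree, Buffers and Cached (kB) between two snapshots."""
    b, a = parse_meminfo(before), parse_meminfo(after)
    return {key: a[key] - b[key] for key in ("MemFree", "Buffers", "Cached")}


# Figures from the first NONE run, in meminfo layout (layout is an assumption).
none_before = "MemFree: 94142740 kB\nBuffers: 584312 kB\nCached: 35219992 kB"
none_after = "MemFree: 93098460 kB\nBuffers: 586248 kB\nCached: 36203412 kB"

# Figures from the first SETALL run.
setall_before = "MemFree: 93253444 kB\nBuffers: 588180 kB\nCached: 36196732 kB"
setall_after = "MemFree: 93247572 kB\nBuffers: 590764 kB\nCached: 36179784 kB"

none_delta = cache_growth(none_before, none_after)
setall_delta = cache_growth(setall_before, setall_after)

print("NONE  :", none_delta)
print("SETALL:", setall_delta)
```

With these figures, Cached grows by 983420 kB in the NONE run and shrinks by 16948 kB in the SETALL run, which matches the conclusion that the backup no longer fills the page cache.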

                Satish...