Database Crashing Due To Memory Issues

3625952

    Hello everyone! We are having major problems with one of our production databases that we recently migrated from 10.2.0.2 on a SUSE box to Oracle 12.1.0.2 on a CentOS server. Since the migration, the database has crashed daily. The CentOS server is a VM with 4 GB of memory, and the database is very lightly used. The server chugs along fine, and then after 12 hours or so we start to see these errors in the alert log:

     

    Thu Feb 01 04:51:18 2018

    Process m001 died, see its trace file

    Process m000 died, see its trace file

    Process m001 died, see its trace file

    Thu Feb 01 05:03:47 2018

    Process J004 died, see its trace file

    Thu Feb 01 05:03:47 2018

    kkjcre1p: unable to spawn jobq slave process

     

     

    The database eventually crashes.  Here are some of our database parameters:

     

    memory_max_target                    big integer 3008M

    memory_target                        big integer 3008M

    parallel_servers_target              integer     10

    pga_aggregate_target                 big integer 30M

    sga_target                           big integer 0

    java_pool_size                       big integer 0

    large_pool_size                      big integer 0

    olap_page_pool_size                  big integer 0

    shared_pool_reserved_size            big integer 32715571

    shared_pool_size                     big integer 0

    streams_pool_size                    big integer 608M

     

    Do you think we should get away from AMM, and set the pga and sga manually?  Any help would be greatly appreciated.  Thanks so much.

      • 1. Re: Database Crashing Due To Memory Issues
        EdStevens

        3625952 wrote:

        Hello everyone! We are having major problems with one of our production databases that we recently migrated from 10.2.0.2 on a SUSE box to Oracle 12.1.0.2 on a CentOS server. […]

        Do you think we should get away from AMM, and set the pga and sga manually? Any help would be greatly appreciated. Thanks so much.

        My first question is why are you running on an OS that is not certified by Oracle?

        • 2. Re: Database Crashing Due To Memory Issues
          3625952

          I wish I had a good answer for that.  It came from the top.  Something about cutting costs etc. 

          • 3. Re: Database Crashing Due To Memory Issues
            SeánMacGC

            As Ed says, it's incomprehensible that you'd run any Production system on a non-certified platform.


            Aside from that, yes, I'd be inclined not to use AMM (especially for a Production system).
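
            If you do drop AMM, the switch to manually sized SGA and PGA would look roughly like this (a sketch only: the 2G/500M figures are placeholders to size for your own workload, and it assumes you run from an spfile and have a restart window):

            SQL> alter system set memory_target=0 scope=spfile;
            SQL> alter system set memory_max_target=0 scope=spfile;
            SQL> alter system set sga_target=2G scope=spfile;
            SQL> alter system set pga_aggregate_target=500M scope=spfile;
            SQL> shutdown immediate
            SQL> startup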

            • 4. Re: Database Crashing Due To Memory Issues
              John Thorton

              3625952 wrote:

              Hello everyone! We are having major problems with one of our production databases that we recently migrated from 10.2.0.2 on a SUSE box to Oracle 12.1.0.2 on a CentOS server. […]

              Do you think we should get away from AMM, and set the pga and sga manually? Any help would be greatly appreciated. Thanks so much.

              is database impacted by PROCESSES/SESSIONS parameters having too small value?
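
              A quick way to check (just a sketch; compare MAX_UTILIZATION against LIMIT_VALUE):

              SQL> select resource_name, current_utilization, max_utilization, limit_value
                   from v$resource_limit where resource_name in ('processes', 'sessions');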

               

              Why was Oracle DB installed on unsupported OS?

              • 5. Re: Database Crashing Due To Memory Issues
                EdStevens

                3625952 wrote:

                 

                I wish I had a good answer for that. It came from the top. Something about cutting costs etc.

                "The top" needs to be informed that their production database is running on an uncertified platform and as a result you production database is unsupported.

                 

                As for cost, consider the following:

                1) The license for Oracle Linux is $0.  The cost of a minimal support contract is less than "The Top" spends on a single business lunch.

                2) What is the cost of a problem that requires assistance from Oracle Support, when you are unsupported?

                • 6. Re: Database Crashing Due To Memory Issues
                  JohnWatson2

                  Any number of people run on uncertified platforms, Amazon AWS or VMware for example. The CentOS kernel is the RedHat kernel (as is Oracle Enterprise Linux) so it really doesn't matter. Except for support: Oracle may, if they wish, say that the problems are yours. They don't do that very often, so you can raise a TAR and hope.

                   

                  However, in this case I think the problems may well be yours. I'm thinking kernel parameters and the like. You probably haven't run the oracle-rdbms-server-12cR1-preinstall.rpm, and I don't know whether you can on CentOS. Download it and try! Otherwise, you had better go through the requirements for kernel settings, shared memory, tmpfs, and all the rest in great detail.
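
                  The sort of checks I mean, for what it's worth (a sketch only; the exact values to aim for are in the 12.1 install guide, not here):

                  # shared memory / semaphore / file-handle kernel settings from the install guide
                  sysctl kernel.shmmax kernel.shmall kernel.shmmni kernel.sem fs.file-max
                  # tmpfs has to be at least as big as memory_max_target for AMM to start
                  df -h /dev/shm
                  # HugePages and AMM are incompatible, so make sure none are configured
                  grep Huge /proc/meminfo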

                   

                  I wouldn't think there is any problem with using AMM.

                   

                  Have you looked at the alert log and the trace files?

                  • 7. Re: Database Crashing Due To Memory Issues
                    Andris Perkons-Oracle

                    See this thread: Out-of-memory

                    Disabling AMM helped in that case.

                     

                    The bug mentioned in that thread seems to be fixed with the 12.1.0.2.170418 Proactive BP.

                     

                    Andris

                    • 8. Re: Database Crashing Due To Memory Issues
                      3625952

                      Thanks so much for your replies.  I have read that it is probably related to that bug.  I guess the next logical step is to apply the patch and see if that resolves it.  If not, I can disable AMM.  I totally agree with what everyone is saying.  We all questioned why we would ever move to CentOS, but the 'top' had already made their decision.

                      • 9. Re: Database Crashing Due To Memory Issues
                        JohnWatson2

                        Why do you think your problem is anything to do with memory?

                        There is no reason not to apply that patch (except that you might as well apply the January one instead) but if I were you I would not stop researching it. You have not even mentioned any ORA- message yet.

                        • 10. Re: Database Crashing Due To Memory Issues
                          3625952

                          There are actually no ORA- messages in the alert log.  Everything looks completely normal, and out of the blue, we start seeing the j000 and m000 errors, and nothing else.  If I check the trace file associated with the errors, I see:

                           

                          *** 2018-02-01 22:00:02.056

                          *** SESSION ID:(787.4096) 2018-02-01 22:00:02.056

                          *** CLIENT ID:() 2018-02-01 22:00:02.056

                          *** SERVICE NAME:(SYS$BACKGROUND) 2018-02-01 22:00:02.056

                          *** MODULE NAME:() 2018-02-01 22:00:02.056

                          *** CLIENT DRIVER:() 2018-02-01 22:00:02.056

                          *** ACTION NAME:() 2018-02-01 22:00:02.056

                           

                           

                          Setting Resource Manager plan SCHEDULER[0x4445]:DEFAULT_MAINTENANCE_PLAN via scheduler window

                          Setting Resource Manager plan DEFAULT_MAINTENANCE_PLAN via parameter

                          ksktopplmod:message reply 7460

                          ksktopplmod:error[7460] is caught

                           

                           

                          *** 2018-02-02 03:18:47.179

                          Process J003 is dead (pid=13574 req_ver=789 cur_ver=789 state=KSOSP_SPAWNED).

                           

                           

                          *** 2018-02-02 03:33:47.210

                          Process J003 is dead (pid=14210 req_ver=808 cur_ver=808 state=KSOSP_SPAWNED).

                           

                           

                          *** 2018-02-02 03:48:47.201

                          Process J003 is dead (pid=14511 req_ver=834 cur_ver=834 state=KSOSP_SPAWNED).

                           

                           

                          *** 2018-02-02 04:03:47.200

                          Process J003 is dead (pid=15261 req_ver=857 cur_ver=857 state=KSOSP_SPAWNED).

                           

                           

                          *** 2018-02-02 04:18:47.201

                          Process J003 is dead (pid=15505 req_ver=878 cur_ver=878 state=KSOSP_SPAWNED).

                           

                           

                          *** 2018-02-02 04:19:04.409

                          Process J003 is dead (pid=15515 req_ver=879 cur_ver=879 state=KSOSP_SPAWNED).

                           

                           

                          *** 2018-02-02 04:33:47.233

                          Process J003 is dead (pid=16135 req_ver=885 cur_ver=885 state=KSOSP_SPAWNED).

                           

                           

                          *** 2018-02-02 04:48:47.233

                          Process J003 is dead (pid=16381 req_ver=896 cur_ver=896 state=KSOSP_SPAWNED).

                           

                           

                          *** 2018-02-02 05:03:47.213

                          Process J003 is dead (pid=17050 req_ver=909 cur_ver=909 state=KSOSP_SPAWNED).

                           

                           

                          *** 2018-02-02 05:17:11.657

                          Process J004 is dead (pid=17243 req_ver=637 cur_ver=637 state=KSOSP_SPAWNED).

                          • 11. Re: Database Crashing Due To Memory Issues
                            JohnWatson2

                            No problem there that I can see. You said "our database crashes daily". What does it do?

                            • 13. Re: Database Crashing Due To Memory Issues
                              Mark D Powell

                              user3625952, how large is the swap size?  Make sure it matches the Oracle install documentation requirement.
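
                              A quick way to see what you currently have (standard Linux commands, nothing Oracle-specific):

                              free -m       # the Swap: line shows total/used/free swap in MB
                              swapon -s     # lists each swap device and its size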

                              - -

                              Kkjcre1p: unable to spawn jobq slave process Happened Intermittently When Memory_target Is Set and Swap Size Is Not Big (Doc ID 2356025.1)

                              - -

                              What is your value for JOB_QUEUE_PROCESSES?

                              - -

                              HTH -- Mark D Powell --

                              • 14. Re: Database Crashing Due To Memory Issues
                                3625952

                                The best way I can describe it is that the system seems not to release memory; usage builds up until it runs out.  There are no ORA- errors etc.  The database just goes down.  When it does, and I try to start it again, I get this error:

                                 

                                07:31:49 SQL> startup

                                ORA-00845: MEMORY_TARGET not supported on this system

                                 

                                 

                                The job_queue_processes is:

                                job_queue_processes                  integer     500

                                 

                                 

                                df -h /dev/shm/

                                Filesystem      Size  Used Avail Use% Mounted on

                                tmpfs           4.0G  1.5G  2.6G  38% /dev/shm

                                 

                                 

                                free -g

                                              total        used        free      shared  buff/cache   available

                                Mem:              3           0           0           1           2           1

                                Swap:             1           0           1
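
                                From what I have read, ORA-00845 at startup means the free space left in /dev/shm is smaller than memory_max_target, so before the next restart I will check what is still holding space there (a couple of quick commands; my understanding is the AMM SGA granules show up as ora_* files):

                                df -h /dev/shm                 # free space must cover memory_max_target (3008M here)
                                ls -lh /dev/shm | head         # any ora_* files left by a dead instance still count against it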

                                 

                                 

                                Thanks so much for all of your help.
