7 Replies Latest reply: Jan 22, 2013 10:13 PM by 986533

    Can not Start RAC DB after system was rebooted

    986533
      Hello - I'm wondering if someone can help with a problem I'm running into. It's an 11gR2 RAC setup with two nodes running on 64-bit Linux. I am using DNS for the VIP and SCAN addresses, but not for the private IP. The problem is this: after rebooting the system I'm not able to start the DB. I am providing some details below that may or may not help, but it's a start. Please ask if you need further information; I'd appreciate it if you could include details on where and how to collect that information, to avoid any back and forth.

      ---------------


      [oracle@SC-RAC1 ~]$ id
      uid=1101(oracle) gid=1000(oinstall) groups=1000(oinstall),1200(dba),1201(oper),1300(asmdba)

      [oracle@SC-RAC1 ~]$ srvctl status asm
      ASM is running on sc-rac1,sc-rac2

      [oracle@SC-RAC1 ~]$ srvctl status listener
      Listener LISTENER is enabled
      Listener LISTENER is running on node(s): sc-rac1,sc-rac2

      [oracle@SC-RAC1 ~]$ srvctl status scan_listener
      SCAN Listener LISTENER_SCAN1 is enabled
      SCAN listener LISTENER_SCAN1 is running on node sc-rac2
      SCAN Listener LISTENER_SCAN2 is enabled
      SCAN listener LISTENER_SCAN2 is running on node sc-rac1
      SCAN Listener LISTENER_SCAN3 is enabled
      SCAN listener LISTENER_SCAN3 is running on node sc-rac1

      [oracle@SC-RAC1 ~]$ srvctl config database -d ORCL
      Database unique name: ORCL
      Database name: ORCL
      Oracle home: /u01/app/oracle/product/11.2.0/db_1
      Oracle user: oracle
      Spfile: +DATA/ORCL/spfileORCL.ora
      Domain:
      Start options: open
      Stop options: immediate
      Database role: PRIMARY
      Management policy: AUTOMATIC
      Server pools: ORCL
      Database instances: ORCL1,ORCL2
      Disk Groups: DATA,FRA
      Services:
      Database is administrator managed

      [oracle@SC-RAC1 ~]$ srvctl status vip -n sc-rac1
      VIP sc-rac1-vip is enabled
      VIP sc-rac1-vip is running on node: sc-rac1

      [oracle@SC-RAC1 ~]$ srvctl status vip -n sc-rac2
      VIP sc-rac2-vip is enabled
      VIP sc-rac2-vip is running on node: sc-rac2


      [oracle@SC-RAC1 ~]$ srvctl status database -d ORCL
      Instance ORCL1 is not running on node sc-rac1
      Instance ORCL2 is not running on node sc-rac2


      [oracle@SC-RAC1 ~]$ srvctl start database -d ORCL
      PRCR-1079 : Failed to start resource ora.orcl.db
      ORA-01034: ORACLE not available
      ORA-27101: shared memory realm does not exist
      Linux-x86_64 Error: 2: No such file or directory
      Process ID: 0
      Session ID: 0 Serial number: 0

      ORA-03113: end-of-file on communication channel
      Process ID: 7038
      Session ID: 1 Serial number: 3

      CRS-2674: Start of 'ora.orcl.db' on 'sc-rac1' failed
      ORA-01034: ORACLE not available
      ORA-27101: shared memory realm does not exist
      Linux-x86_64 Error: 2: No such file or directory
      Process ID: 0
      Session ID: 0 Serial number: 0

      CRS-2632: There are no more servers to try to place resource 'ora.orcl.db' on that would satisfy its placement policy
      ORA-01034: ORACLE not available
      ORA-27101: shared memory realm does not exist
      Linux-x86_64 Error: 2: No such file or directory
      Process ID: 0
      Session ID: 0 Serial number: 0

      ORA-03113: end-of-file on communication channel
      Process ID: 5837
      Session ID: 1 Serial number: 3

      CRS-2674: Start of 'ora.orcl.db' on 'sc-rac2' failed
      ORA-01034: ORACLE not available
      ORA-27101: shared memory realm does not exist
      Linux-x86_64 Error: 2: No such file or directory
      Process ID: 0
      Session ID: 0 Serial number: 0

      -----------------
      [oracle@SC-RAC1 ~]$ env | grep ORACLE
      ORACLE_UNQNAME=ORCL
      ORACLE_SID=ORCL1
      ORACLE_BASE=/u01/app/oracle
      ORACLE_HOME=/u01/app/oracle/product/11.2.0/db_1
      [oracle@SC-RAC1 ~]$


      The "oracle" user's profile from the sc-rac1 node is pasted below:

      --------------

      [oracle@SC-RAC1 ~]$ more .bash_profile
      # .bash_profile

      # Get the aliases and functions
      if [ -f ~/.bashrc ]; then
      . ~/.bashrc
      fi

      # User specific environment and startup programs

      PATH=$PATH:$HOME/bin

      export PATH

      export EDITOR=vi
      export ORACLE_HOME=/u01/app/oracle/product/11.2.0/db_1
      export ORACLE_BASE=/u01/app/oracle
      export PATH=$PATH:$ORACLE_HOME/bin:$ORACLE_HOME/OPatch
      export ORACLE_SID=ORCL1
      export ORACLE_UNQNAME=ORCL
      export LD_LIBRARY_PATH=$ORACLE_HOME/lib

      [oracle@SC-RAC1 ~]$

      --------------------


      Any direction to resolve this is highly appreciated.

      Thanks,
        • 1. Re: Can not Start RAC DB after system was rebooted
          sb92075
          what additional clues exist within alert_SID.log file for each instance?
          • 2. Re: Can not Start RAC DB after system was rebooted
            986533
            Good point. I ran tail on the log file and then executed srvctl start database -d ORCL again.

            It shows the following problem:

            ------------------------

            Errors in file /u01/app/oracle/diag/rdbms/orcl/ORCL1/trace/ORCL1_ora_9903.trc:
            ORA-19815: WARNING: db_recovery_file_dest_size of 4070572032 bytes is 100.00% used, and has 0 remaining bytes available.
            ************************************************************************
            You have following choices to free up space from recovery area:
            1. Consider changing RMAN RETENTION POLICY. If you are using Data Guard,
            then consider changing RMAN ARCHIVELOG DELETION POLICY.
            2. Back up files to tertiary device such as tape using RMAN
            BACKUP RECOVERY AREA command.
            3. Add disk space and increase db_recovery_file_dest_size parameter to
            reflect the new space.
            4. Delete unnecessary files using RMAN DELETE command. If an operating
            system command was used to delete files, then use RMAN CROSSCHECK and
            DELETE EXPIRED commands.
            ************************************************************************
            Errors in file /u01/app/oracle/diag/rdbms/orcl/ORCL1/trace/ORCL1_ora_9903.trc:
            ORA-19809: limit exceeded for recovery files
            ORA-19804: cannot reclaim 44040192 bytes disk space from 4070572032 limit
            ARCH: Error 19809 Creating archive log file to '+FRA'
            Errors in file /u01/app/oracle/diag/rdbms/orcl/ORCL1/trace/ORCL1_ora_9903.trc:
            ORA-16038: log 1 sequence# 39 cannot be archived
            ORA-19809: limit exceeded for recovery files
            ORA-00312: online log 1 thread 1: '+DATA/orcl/onlinelog/group_1.261.803659781'
            ORA-00312: online log 1 thread 1: '+FRA/orcl/onlinelog/group_1.257.803659781'
            USER (ospid: 9903): terminating the instance due to error 16038
            Tue Jan 22 22:00:13 2013
            System state dump is made for local instance
            System State dumped to trace file /u01/app/oracle/diag/rdbms/orcl/ORCL1/trace/ORCL1_diag_9783.trc
            Trace dumping is performing id=[cdmp_20130122220013]
            Instance terminated by USER, pid = 9903
            --------------------------

            Can you please tell me how to fix this and keep it from filling up again? Sorry if it's an obvious question, but all I have done is set up RAC with one DB and the sample schema, nothing else, and in 2 weeks it's full :(
            • 3. Re: Can not Start RAC DB after system was rebooted
              sb92075
              You have following choices to free up space from recovery area:
              1. Consider changing RMAN RETENTION POLICY. If you are using Data Guard,
              then consider changing RMAN ARCHIVELOG DELETION POLICY.
              2. Back up files to tertiary device such as tape using RMAN
              BACKUP RECOVERY AREA command.
              3. Add disk space and increase db_recovery_file_dest_size parameter to
              reflect the new space.
              4. Delete unnecessary files using RMAN DELETE command. If an operating
              system command was used to delete files, then use RMAN CROSSCHECK and
              DELETE EXPIRED commands.
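
As a rough illustration of choice 4, the sketch below writes an RMAN command file and runs it only if rman is on the PATH. The file location (CMDFILE) and the one-day cutoff are assumptions made for the example, not values from this thread, and deleting archived logs removes files that a restore might need.

```shell
#!/bin/sh
# Sketch: build an RMAN command file for cleaning archived logs out of the FRA.
# CMDFILE and the one-day window are illustrative assumptions.
CMDFILE="${TMPDIR:-/tmp}/fra_cleanup.rman"

cat > "$CMDFILE" <<'EOF'
# Reconcile the RMAN repository with what is actually on disk or in ASM.
CROSSCHECK ARCHIVELOG ALL;
# Drop repository records for archived logs that no longer exist.
DELETE NOPROMPT EXPIRED ARCHIVELOG ALL;
# Remove archived logs completed more than one day ago.
DELETE NOPROMPT ARCHIVELOG ALL COMPLETED BEFORE 'SYSDATE-1';
EOF

if command -v rman >/dev/null 2>&1; then
    # Connects to the local instance using OS authentication.
    rman target / cmdfile="$CMDFILE"
else
    echo "rman not found on PATH; review $CMDFILE and run it on the database host"
fi
```

RMAN can run these commands against a mounted (not open) database, which matters when the instance cannot open because the archiver is stuck.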
              • 4. Re: Can not Start RAC DB after system was rebooted
                986533
                Is there a command or set of commands I can run to stop this from happening again?

                Right now it is set to:

                SQL> startup mount
                /
                ORACLE instance started.

                Total System Global Area 1068937216 bytes
                Fixed Size 2220200 bytes
                Variable Size 817893208 bytes
                Database Buffers 243269632 bytes
                Redo Buffers 5554176 bytes
                Database mounted.
                SQL> SP2-0103: Nothing in SQL buffer to run.
                SQL> show parameter DB_RECOVERY_FILE_DEST_SIZE

                NAME TYPE VALUE
                ------------------------------------ ----------- ------------------------------
                db_recovery_file_dest_size big integer 3882M


                So I can extend the space and that will raise the limit, but my question is how to keep this from happening again.

                I guess I can go ahead and do further reading on it if the answer is not straightforward?
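
For reference, the 3882M reported by show parameter is the same limit the alert log printed as 4070572032 bytes. A small sketch of the conversion, assuming 1 MB = 1048576 bytes (bytes_to_mb is a hypothetical helper name):

```shell
#!/bin/sh
# Convert a byte count to whole megabytes (1 MB = 1048576 bytes here).
bytes_to_mb() {
    echo $(( $1 / 1048576 ))
}

bytes_to_mb 4070572032   # limit reported with ORA-19815
bytes_to_mb 44040192     # space the archiver could not reclaim (ORA-19804)
```

This prints 3882 and 42: the configured limit, and the roughly 42 MB the archiver needed for the next archived log but could not find.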
                • 5. Re: Can not Start RAC DB after system was rebooted
                  sb92075
                  When a database has Archive REDO enabled, in your setup the Fast Recovery Area (FRA) is where REDO log files get archived to.

                  Depending upon how you manage your DB backups, the archived REDO files may be needed for DB Recovery.

                  YOU need to manage the DB's disk space, including the FRA, and ensure you can recover the DB after a complete system failure.
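
One way to keep an eye on the FRA is to query the recovery-area views from time to time. The sketch below is only an illustration: the script path (SQLFILE) is an assumption, and it presumes OS authentication as SYSDBA works for the oracle user on the node.

```shell
#!/bin/sh
# Sketch: write a SQL*Plus script that reports FRA usage, then run it if sqlplus is available.
# SQLFILE is an illustrative path, not something from this thread.
SQLFILE="${TMPDIR:-/tmp}/fra_usage.sql"

cat > "$SQLFILE" <<'EOF'
SET LINESIZE 200
-- Overall limit and usage of the recovery area, in MB.
SELECT name,
       ROUND(space_limit / 1048576)       AS limit_mb,
       ROUND(space_used / 1048576)        AS used_mb,
       ROUND(space_reclaimable / 1048576) AS reclaimable_mb
FROM   v$recovery_file_dest;
-- Breakdown by file type (archived logs, backup pieces, and so on).
SELECT file_type, percent_space_used, percent_space_reclaimable, number_of_files
FROM   v$flash_recovery_area_usage;
EXIT
EOF

if command -v sqlplus >/dev/null 2>&1; then
    sqlplus -S "/ as sysdba" @"$SQLFILE"
else
    echo "sqlplus not found on PATH; run $SQLFILE on the database host"
fi
```

If used_mb keeps approaching limit_mb with little reclaimable space, archived logs are accumulating faster than anything is removing them.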
                  • 6. Re: Can not Start RAC DB after system was rebooted
                    986533
                    I understand your view, but for my purpose I really don't care about backups and other best practices. It's a test environment which I won't be using once I have completed the integration testing that requires the RAC. I must admit, I'm not a DBA, hence I'm asking perhaps simple questions that you may expect I should know.

                    So I think I can just go ahead and disable archive logging.

                    If you have any quick fix other than sending me down the RMAN and backup strategy route, then please let me know, or I can mark this question as answered as is.

                    Thanks again.
                    • 7. Re: Can not Start RAC DB after system was rebooted
                      986533
                      Okay. So here is what I did, in case someone in the future is looking for a quick fix to move on. As I stated earlier, I really don't care about the backup...

                      RMAN > delete archivelog all;

                      After that, alert_sid.log showed:

                      db_recovery_file_dest_size of 3882 MB is 5.77% used

                      Then the DB started fine:

                      [oracle@SC-RAC1 ~]$ srvctl start database -d ORCL
                      [oracle@SC-RAC1 ~]$ srvctl status database -d ORCL
                      Instance ORCL1 is running on node sc-rac1
                      Instance ORCL2 is running on node sc-rac2
                      [oracle@SC-RAC1 ~]$

                      I am also going to disable archive mode:

                      ALTER DATABASE NOARCHIVELOG;

                      So here you have an ugly fix, but if you want to do it properly then listen to the experts on this forum. You can start with this link:

                      http://docs.oracle.com/cd/E11882_01/backup.112/e10642/rcmconfb.htm#CHDCFHBG
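
A note on the ALTER DATABASE NOARCHIVELOG step: the statement only works while the database is mounted but not open, and on RAC every other instance has to be down at that point. The sketch below shows one possible sequence under those assumptions. The script path (SQLFILE) is hypothetical, ORCL is the database name from this thread, and it presumes the database currently shuts down cleanly. Older releases may also need cluster_database set to false first, so check the documentation for your version.

```shell
#!/bin/sh
# Sketch: switch a RAC database to NOARCHIVELOG mode.
# SQLFILE is an illustrative path; ORCL is the database name used in this thread.
SQLFILE="${TMPDIR:-/tmp}/noarchivelog.sql"

cat > "$SQLFILE" <<'EOF'
STARTUP MOUNT
ALTER DATABASE NOARCHIVELOG;
ARCHIVE LOG LIST
SHUTDOWN IMMEDIATE
EXIT
EOF

if command -v srvctl >/dev/null 2>&1 && command -v sqlplus >/dev/null 2>&1; then
    srvctl stop database -d ORCL            # stop every instance first
    sqlplus -S "/ as sysdba" @"$SQLFILE"    # mount the local instance only and change the mode
    srvctl start database -d ORCL           # bring all instances back up
else
    echo "srvctl or sqlplus not found on PATH; review $SQLFILE and run the steps on a cluster node"
fi
```

ARCHIVE LOG LIST is included only so the output confirms "No Archive Mode" before the instance is shut down again.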