6 Replies Latest reply: Jun 12, 2014 2:30 AM by Kavanagh RSS

    Few Questions on Manual Recovery

    Kavanagh

      DB version: 11.2.0.3.0
      Platform : Oracle Linux 6.4

       

      I am not an expert in Recovery.

       

      We did a SAN level cloning of a RAC DB. We store Online Redo logs in RL_DG diskgroup and Archive Redo logs in ARCH_DG diskgroup.

      After the SAN level cloning (EMC BCV Synch ) , we tried to start up the DB using srvctl start database -d <MyDBName> and we got the following error.

       

      CRS-5017: The resource action "ora.msblprd.db start" encountered the following error:
      ORA-01113: file 4 needs media recovery
      ORA-01110: data file 4: '+MBL_DATA/msblprd/datafile/users.271.803502221'

       

      We realized that lots of datafiles (not justs users.dbf)  need recovery. So, we tried to run RECOVER DATABASE. But it errored out. So, we ran RECOVER DATABASE USING BACKUP CONTROLFILE UNTIL CANCEL ;  command.

       

      [oracle@mssrpt01dbp1001 ~]$ sqlplus / as sysdba

      SQL*Plus: Release 11.2.0.3.0 Production on Tue Jun 10 14:41:41 2014

      Copyright (c) 1982, 2011, Oracle.  All rights reserved.

      Connected to an idle instance.

      SQL> startup mount;
      ORACLE instance started.

      Total System Global Area 8.5516E+10 bytes
      Fixed Size                  2237776 bytes
      Variable Size            2.4428E+10 bytes
      Database Buffers         6.0935E+10 bytes
      Redo Buffers              151142400 bytes
      Database mounted.
      SQL> SELECT MEMBER FROM V$LOG G, V$LOGFILE F WHERE G.GROUP# = F.GROUP# AND G.STATUS = 'CURRENT';

      MEMBER
      --------------------------------------------------------------------------------
      +RL_DG/msblprd/onlinelog/group_3.258.834497269
      +RL_DG/msblprd/onlinelog/group_3.258.834497283
      +RL_DG/msblprd/onlinelog/group_13.262.834498753
      +RL_DG/msblprd/onlinelog/group_13.262.834498735
      +RL_DG/msblprd/onlinelog/group_32.269.834527443
      +RL_DG/msblprd/onlinelog/group_32.269.834527459
      +RL_DG/msblprd/onlinelog/group_44.267.834526221
      +RL_DG/msblprd/onlinelog/group_44.267.834526237

      8 rows selected.

      SQL>
      SQL>
      SQL> RECOVER DATABASE;
      ORA-00283: recovery session canceled due to errors
      ORA-01610: recovery using the BACKUP CONTROLFILE option must be done

      SQL> RECOVER DATABASE USING BACKUP CONTROLFILE UNTIL CANCEL ;
      ORA-00279: change 7996788033403 generated at 06/10/2014 04:57:12 needed for thread 2
      ORA-00289: suggestion : +ARCH_DG
      ORA-00280: change 7996788033403 for thread 2 is in sequence #16085


      Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
      +RL_DG/msblprd/onlinelog/group_3.258.834497269
      ORA-00325: archived log for thread 2, wrong thread # 1 in header
      ORA-00334: archived log: '+RL_DG/msblprd/onlinelog/group_3.258.834497269'


      ORA-01547: warning: RECOVER succeeded but OPEN RESETLOGS would get error below
      ORA-01194: file 1 needs more recovery to be consistent
      ORA-01110: data file 1: '+SBL_DATA/msblprd/datafile/system.260.803502221'

      SQL> RECOVER DATABASE USING BACKUP CONTROLFILE UNTIL CANCEL ;
      ORA-00279: change 7996788033403 generated at 06/10/2014 04:57:12 needed for thread 2
      ORA-00289: suggestion : +ARCH_DG
      ORA-00280: change 7996788033403 for thread 2 is in sequence #16085


      Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
      +RL_DG/msblprd/onlinelog/group_3.258.834497283
      ORA-00325: archived log for thread 2, wrong thread # 1 in header
      ORA-00334: archived log: '+RL_DG/msblprd/onlinelog/group_3.258.834497283'


      ORA-01547: warning: RECOVER succeeded but OPEN RESETLOGS would get error below
      ORA-01194: file 1 needs more recovery to be consistent
      ORA-01110: data file 1: '+SBL_DATA/msblprd/datafile/system.260.803502221'

      SQL> RECOVER DATABASE USING BACKUP CONTROLFILE UNTIL CANCEL ;
      ORA-00279: change 7996788033403 generated at 06/10/2014 04:57:12 needed for thread 2
      ORA-00289: suggestion : +ARCH_DG
      ORA-00280: change 7996788033403 for thread 2 is in sequence #16085


      Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
      +RL_DG/msblprd/onlinelog/group_13.262.834498753
      ORA-00310: archived log contains sequence 16107; sequence 16085 required
      ORA-00334: archived log: '+RL_DG/msblprd/onlinelog/group_13.262.834498753'


      ORA-01547: warning: RECOVER succeeded but OPEN RESETLOGS would get error below
      ORA-01194: file 1 needs more recovery to be consistent
      ORA-01110: data file 1: '+SBL_DATA/msblprd/datafile/system.260.803502221'

       

      -- I tried entering AUTO

      SQL> RECOVER DATABASE USING BACKUP CONTROLFILE UNTIL CANCEL ;
      ORA-00279: change 7996788033403 generated at 06/10/2014 04:57:12 needed for thread 2
      ORA-00289: suggestion : +ARCH_DG
      ORA-00280: change 7996788033403 for thread 2 is in sequence #16085


      Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
      AUTO
      ORA-00308: cannot open archived log '+ARCH_DG'
      ORA-17503: ksfdopn:2 Failed to open file +ARCH_DG
      ORA-15045: ASM file name '+ARCH_DG' is not in reference form


      The above attempts didn't work because oracle was looking for archived logs which were not present in ARCH_DG as RMAN has backed it up to tape and issued "delete input" .

      But I have few basic questions.

       

      Question1. Why did RECOVER DATABASE command fail with ORA-01610 ?

       

      Question2. Both RECOVER DATABASE and RECOVER DATABASE USING BACKUP CONTROLFILE UNTIL CANCEL commands will only look for archivelogs not Online Redo logs. Right ? I understand that database will try to do media recover by applying Online Redo Logs when the STARTUP OPEN command is issued. Right ?

       

      Question3. In the command RECOVER DATABASE USING BACKUP CONTROLFILE UNTIL CANCEL  which BACKUP CONTROLFILE are we referring to ?

       

      Question4. What does AUTO option do in the 'Specify log' prompt as I've shown above in red?

        • 1. Re: Few Questions on Manual Recovery
          Hemant K Chitale

          Apparently

          a.  Your database is at least a 2 node (instances) RAC database  (from the messages "needed for thread 2")  so you'd need the logs from all the instances

          b.  You need to check how the SAN level cloning of disk groups is done, are all the disk groups consistent, is the database put in BEGIN BACKUP mode ?

           

           

          Hemant K Chitale


          • 2. Re: Few Questions on Manual Recovery
            Kavanagh

            Thanks Hemant.


            No the DB was not put in BEGIN BACKUP mode. We never needed it. It has been working fine without BEGIN BACKUP mode.

             

            Source DB was a 4-Node RAC DB and the Target DB (clone DB) is a 2-node RAC DB. But archive logs from all 4 instances are available in ARCH_DG which is a shared storage. ARCH_DG has been cloned at SAN level. I think when the SAN Cloning is underway, in production, RMAN might have backed up the archivedlog and deleted it due to the "delete input" clause .

            • 3. Re: Few Questions on Manual Recovery
              Hemant K Chitale

              imho you should put the database in BEGIN BACKUP mode when you create the clone.

              Check with EMC on their documentation for Oracle.

              Check Oracle Support Note "Supported Backup, Restore and Recovery Operations using Third Party Snapshot Technologies (Doc ID 604683.1)"

               

               

              Hemant K Chitale

               


              • 4. Re: Few Questions on Manual Recovery
                Kavanagh

                Any volunteers for answering my questions 1,2,3,4 ?

                • 5. Re: Few Questions on Manual Recovery
                  Hemant K Chitale

                  Question1. Why did RECOVER DATABASE command fail with ORA-01610 ?

                  Oracle detected that the controlfile was "older" (e.g. had a lower checkpoint#) than the database.  So it recognised the controlfile as a BACKUP CONTROLFILE.  In a normal RECOVER DATABASE you are doing a Complete Recovery which can be done only if the real current controlfile and redo log files are consistent and "newer" than the database files.

                   

                  Question2. Both RECOVER DATABASE and RECOVER DATABASE USING BACKUP CONTROLFILE UNTIL CANCEL commands will only look for archivelogs not Online Redo logs. Right ? I understand that database will try to do media recover by applying Online Redo Logs when the STARTUP OPEN command is issued. Right ?

                  No. No.  (in the normal case) The RECOVER DATABASE can automatically identify that the online redo log file needs to be applied to do a Complete Recovery.  In your case, Oracle was already attempting an Incomplete Recovery.  It would have to be supplied (or be able to AUTO-generate) the archivelog file names.  Applying the Online Redo Log files happens only in an Instance Recovery -- a STARTUP after a crash or abort.  Your case was a Media Recovery (again because of the Backup control file).

                   

                  Question3. In the command RECOVER DATABASE USING BACKUP CONTROLFILE UNTIL CANCEL  which BACKUP CONTROLFILE are we referring to ?

                  The controlfile that you have on disk is "older" -- it is recognised as a Backup controlfile.

                   

                  Question4. What does AUTO option do in the 'Specify log' prompt as I've shown above in red?

                  (in the normal case) Oracle can generate the "next" archivelog file name based on the last applied archivelog and the log_archive_dest and log_archive_format.  Since you hadn't applied any archivelog yet, it wouldn't be able to generate the next expected archivelog --- this also happens in RAC recovery where it can auto-generate the next archivelog file name for (say) Thread#1 because the archivelog for Thread#1 has been applied but can't yet auto-generate for Thread#2 until one archivelog has been applied.

                   

                   

                  You will also notice the sever mismatch between the online redo log header and the expected log

                  >archived log contains sequence 16107; sequence 16085 required

                  This means that the datafiles is much older than the onlne redo log. Since your listing doesn't show 22 redo log groups, it is most likely that the snapshot of the redo log DG and the database files DG are not consistent --- the second is much older than the first.  You need an archivelog to be applied, not the online redo log.

                   

                  Your query on V$LOG and V$LOGFILE should always inclue THREAD# (when running RAC) and SEQUENCE# (each Thread has it's own Sequence)

                   

                  You may also check your alert log. It might indicate that Oracle was attempting to do recover Fuzzy datafiles -- that would happen if your snapshot was taken without the database in BACKUP mode.

                   

                  Talk to EMC and Oracle support for the correct advice on how to do a Snapshot backup of an Oracle database.  Else, you'll be toast when you have to do a Real Production Database Recovery.  I suspect that your DGs and EMC volumes are not setup properly to create synchronised snapshot backups.

                   

                   

                  Hemant K Chitale

                   

                  Message was edited by: Hemant K Chitale Added sentence " I suspect that your DGs and EMC volumes are not setup properly to create synchronised snapshot backups."

                  • 6. Re: Few Questions on Manual Recovery
                    Kavanagh

                    THANK YOU VERY MUCH Hemant. You are one helluva professional volunteer !!