Forum Stats

  • 3,759,874 Users
  • 2,251,609 Discussions
  • 7,870,852 Comments

Discussions

Is it possible to automate RMAN Recovery with a bash script?

User_W2V5A
User_W2V5A Member Posts: 5 Green Ribbon

Hi Everyone!

I am attempting to automate our Disaster Recovery plan as much as possible in an attempt to reduce our RTO/RPO.

One of the components of our DRP is restoring the oracle database using RMAN. Unfortunately my knowledge of database recovery is limited and hence I have made a test environment to experiment.

I am given a flash_recovery_area which contains dummy data and an instruction manual on how it is usually done manually by the database team.

With that, I constructed my first simple script:

LogFile="/home/oracle/rmanlog.log"

LogFile2="/home/oracle/rmantimelog.log"

SECONDS=0

rman target= / nocatalog LOG=$LogFile << EOF

run

{

STARTUP FORCE NOMOUNT;

RESTORE CONTROLFILE FROM AUTOBACKUP;

ALTER DATABASE MOUNT;

configure device type 'SBT_TAPE' clear;

CONFIGURE CONTROLFILE AUTOBACKUP FORMAT FOR DEVICE TYPE SBT_TAPE clear;

CONFIGURE CHANNEL DEVICE TYPE 'SBT_TAPE' clear;

RESTORE DATABASE;

RECOVER DATABASE;

ALTER DATABASE OPEN RESETLOGS;

}

EOF

ELAPSED="Elapsed: $(($SECONDS / 3600))hrs $((($SECONDS / 60) % 60))min $(($SECONDS % 60))sec"

echo $ELAPSED > /home/oracle/rmantimelog.log

The problem with this script is that it works 4 or 5 times out of 10 tries and the time it doesn't work it gives the following error.

RMAN-06054: media recovery requesting unknown archived log for thread 1 with sequence 250124 and starting SCN of 19792842233

Obviously having a DRP that only works 4 or 5 times out of 10 isn't something anybody would want, so I tried another script which attempts to grab the SCN number and restore it again after the first try.

LogFile="/home/oracle/rmanlog.log"

LogFile2="/home/oracle/rmanlog2.log"

rman target= / nocatalog LOG=$LogFile << EOF

run

{

STARTUP FORCE NOMOUNT;

RESTORE CONTROLFILE FROM AUTOBACKUP;

ALTER DATABASE MOUNT;

configure device type 'SBT_TAPE' clear;

CONFIGURE CONTROLFILE AUTOBACKUP FORMAT FOR DEVICE TYPE SBT_TAPE clear;

CONFIGURE CHANNEL DEVICE TYPE 'SBT_TAPE' clear;

RESTORE DATABASE;

RECOVER DATABASE;

}

EOF


if grep -q SCN "$LogFile"; then

 SCN=$(grep -w "SCN" rmanlog.log | awk '{print $18}')

 echo $SCN

 rman target= / nocatalog LOG=$LogFile2 << EOF

 run

 {

 STARTUP FORCE NOMOUNT;

 RESTORE CONTROLFILE FROM AUTOBACKUP;

 ALTER DATABASE MOUNT;

 configure device type 'SBT_TAPE' clear;

 CONFIGURE CONTROLFILE AUTOBACKUP FORMAT FOR DEVICE TYPE SBT_TAPE clear;

 CONFIGURE CHANNEL DEVICE TYPE 'SBT_TAPE' clear;

 RESTORE DATABASE;

 RECOVER DATABASE UNTIL SCN $SCN;

 ALTER DATABASE OPEN RESETLOGS;

 }

EOF

fi

However, I am told that the SCN changes every time rman is run and I am not too sure if I am even able to pass a bash variable into the RMAN running environment as the success rate of this script is the same as the first one.

I am hoping that anybody here can shed some light on whether or not this is possible as well as whether if this is even a good idea at all.

Thanks!

Tagged:

Best Answer

  • EdStevens
    EdStevens Member Posts: 28,484 Gold Crown
    edited Jul 16, 2021 12:44PM Accepted Answer

    however as disaster recovery scenarios are random

    Exactly. So, how do you plan to script a random scenario?

    If you are not the DBA, exactly what is your day-to-day role?

    To your original question, yes it is possible to script an rman recovery. If you know which backup location you will be using - which one survived whatever random disaster. If you know where you will be restoring to - what servers survived the disaster. If the server to be recovered has - post disaster - already been built with the same OS version and patch level and Oracle installed to the same version and patch level.

    User_W2V5A

Answers

  • EdStevens
    EdStevens Member Posts: 28,484 Gold Crown

    I am given a flash_recovery_area which contains dummy data and an instruction manual on how it is usually done manually by the database team.

    My first question is, why isn't "the database team" doing this? If I were a director and had a database loss, I'd want my best DBA handing the recovery, not someone who by their own admission "Unfortunately my knowledge of database recovery is limited"

    And my first comment is that, in general, disaster recovery cannot be simply scripted. It all depends on exactly what has been lost and what is available for recovery. A single data file lost or corrupted? Multiple disks lost? OS corrupted? Fire, resulting in total loss of data center? What is normal backup scheme? What backups are available at any given moment? Do you have both local and off-site storage of backups? How far offsite is the offsite? (Consider natural disasters that could take out both locations.)

  • User_W2V5A
    User_W2V5A Member Posts: 5 Green Ribbon

    Thank you for your reply! Let first me address the questions you may have

    "My first question is, why isn't "the database team" doing this? If I were a director and had a database loss, I'd want my best DBA handing the recovery, not someone who by their own admission "Unfortunately my knowledge of database recovery is limited"

    They are currently doing it manually and will likely continue to be doing it. I am trying to explore ways to improve the process as it seems to involve many people and lots of manual work, which increases the randomness of the RTO/RPO. IF it could be automated, RTO/RPOs I feel would be more consistent.

    We do have onsite and offsite backups, however as disaster recovery scenarios are random as it should be, I am exploring the possibility to restore it from an offsite backup to a server in another country automatically. I agree that it largely depends on the scenario, but I feel that some parts of it can be scripted (at least the less sensitive parts, also some parts remain the same regardless of scenarios) in advance so that the work of the recovery team isn't as tedious when the time comes. Of course, this requires a lot of experimentation and testing to ensure that scripts are working as intended all the time.

  • EdStevens
    EdStevens Member Posts: 28,484 Gold Crown
    edited Jul 16, 2021 12:44PM Accepted Answer

    however as disaster recovery scenarios are random

    Exactly. So, how do you plan to script a random scenario?

    If you are not the DBA, exactly what is your day-to-day role?

    To your original question, yes it is possible to script an rman recovery. If you know which backup location you will be using - which one survived whatever random disaster. If you know where you will be restoring to - what servers survived the disaster. If the server to be recovered has - post disaster - already been built with the same OS version and patch level and Oracle installed to the same version and patch level.

    User_W2V5A
  • User_W2V5A
    User_W2V5A Member Posts: 5 Green Ribbon

    "Exactly. So, how do you plan to script a random scenario?"

    Currently I am using a fixed hypothetical scenario to test if automation is even possible.

    "If you are not the DBA, exactly what is your day-to-day role?"

    I guess the best way to describe my role is I am a sysadmin experimenting and looking for better ways to do things if possible.

    "To your original question, yes it is possible to script an rman recovery. If you know which backup location you will be using - which one survived whatever random disaster. If you know where you will be restoring to - what servers survived the disaster. If the server to be recovered has - post disaster - already been built with the same OS version and patch level and Oracle installed to the same version and patch level."

    Great! That gives me confidence to keep trying then! Thank you for your time and advice!

  • EdStevens
    EdStevens Member Posts: 28,484 Gold Crown

    Then as a system admin, you should know that anything you can do at a command line can be scripted. It's just that you have to know in advance exactly what you'd need to do at the command line.

  • User_W2V5A
    User_W2V5A Member Posts: 5 Green Ribbon

    Yes, as described above I have scripted it to a certain extent. I am just unsure if bash variables can be passed into RMAN environment to be run or how that works. There seems to be limited literature on how bash interacts with RMAN if I want to pass certain variables and such, so I am just trying to experiment by trial and error.

    Also, there seems to be an SCN number that is required that apparently changes every run of RMAN? In that case I would have to find a way to get this number within the RMAN session itself. I am just worried I might break something since I am unfamiliar with how RMAN works or how it would react to scripting.

  • EdStevens
    EdStevens Member Posts: 28,484 Gold Crown

    As far as scripting goes, there's nothing magic about rman. It's just a command line utility. You'd pass variables to it just like any command line utility.

    export some_cmd_line_var='fubar'
    some_utility $some_cmd_line_var
    

    or with input redirection

    export some_var='fubar'
    some_utility <<EOF
    some_utility_command  $some_var
    EOF
    

    The SCN doesn't change with every run of rman, it changes with every transaction in the database.

    You really shouldn't be doing this without the help of the DBA. If he doesn't want to help, then you don't want to be doing this at all. If I were the DBA, I wouldn't even allow a non-DBA to be touching this project.

  • User_W2V5A
    User_W2V5A Member Posts: 5 Green Ribbon

    "You really shouldn't be doing this without the help of the DBA. If he doesn't want to help, then you don't want to be doing this at all. If I were the DBA, I wouldn't even allow a non-DBA to be touching this project."

    Hmm is there a reason for this? From my perspective its just a low impact side experiment that is mainly for my own learning with a small chance that it might improve current processes. I don't really see any downsides to doing this projects except spending a few extra hours learning new stuff everyday. I was thinking I'll get their help once there is some proof of concept as I'd figured it wouldn't be nice to bother very busy people with my side project.

    "The SCN doesn't change with every run of rman, it changes with every transaction in the database."

    So is it possible to run rman recovery once, pump the output to a log and retrieve the SCN number. Then run RMAN recovery again and pass the SCN number into the second iteration? Does the first rman recovery count as a "Transaction in the database"?

    Thanks for your help! Really appreciate it!

  • EdStevens
    EdStevens Member Posts: 28,484 Gold Crown

    Have you googled your error - RMAN-06054? If you are only getting this sporadically, then I'd say it is a problem with your testing methodology. We would need to know more about how your db is set up vis a vis archivelog mode, backup destination, archivelog destination, and backup scripts. And what you do to actually initiate a recovery test.

    Do you understand the concept of what is happening with RESTORE DATABASE and RECOVER DATABASE? And the difference between the two?

    You start of with STARTUP FORCE NOMOUNT. The first step of any STARTUP is to read the spfile (or pfile if no spfile is available). Have you taken into consideration of recovering your spfile from whatever recovery scenario you are simulating? Have you looked at the various recovery scenarios detailed in Database Backup and Recovery User's Guide?

    You say you are just working on a POC and don't want to bother a busy DBA. So at what point do you plan on starting to bother him? Database recovery is the DBA's Prime Directive.