6 Replies Latest reply: Dec 11, 2012 1:09 PM by 961716

    How to kill an RMAN job?

    961716
      I'm running 11.2.0.2 on Solaris. What is the proper way to kill an RMAN backup job that has already started?

      This sounds like something as simple as: ps -ef |grep rman, then, kill -9 both rman processes. Right? Wrong!
      $ /home/oracle>ps -ef |grep rman
        oracle 27493 10700   0 14:08:44 pts/1       0:00 /bin/ksh rman_bkup_db_lvl0.ksh MYDB
        oracle 27524 11155   0 14:09:00 pts/3       0:00 grep rman
        oracle 27508 27493   0 14:08:44 pts/1       0:01 rman target /
      
      $ /home/oracle>kill -9 27493 27508
      
      $ /home/oracle>ps -ef |grep rman
        oracle 27552 11155   0 14:09:56 pts/3       0:00 grep rman
      The RMAN processes appear to have been killed.
      But, looking in the database, the RMAN sessions are still running.
      SQL> select username, program from v$session where username = 'SYS';
      
      USERNAME                 PROGRAM
      ------------------------ ------------------------------------------------
      SYS                      rman@myhost (TNS V1-V3)
      SYS                      rman@myhost (TNS V1-V3)
      SYS                      OMS
      SYS                      sqlplus@myhost (TNS V1-V3)
      SYS                      rman@myhost (TNS V1-V3)
      
      (and several others from previous runs that I killed off at the OS level)
      When I looked in OEM, I could see a spike in user I/O and sys I/O and the processes were the RMAN processes.

      I found one web page showing that if I join V$SESSION_WAIT, V$SESSION, and V$PROCESS
      WHERE s.client_info LIKE 'rman%', I can get the SID and Serial# needed to kill the sessions from within the database.

      Is there an easier way to do it than this? In my case, I ended up using Toad to kill all the RMAN sessions, but then I hit bug 11872103, which left the status showing RUNNING and which causes a performance problem when resyncing the catalog.
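
For the in-database route, here is a minimal sketch of the last step. Once the SID and SERIAL# of each RMAN session are known, a small shell helper can print the ALTER SYSTEM KILL SESSION statements for review before they are run in SQL*Plus. The function name gen_kill_sql and the sample SID/SERIAL# values are made up for illustration; nothing here connects to a database.

```shell
#!/bin/sh
# Hypothetical helper: read "SID SERIAL#" pairs (one per line) from stdin
# and print one ALTER SYSTEM KILL SESSION statement per pair.
# Lines whose first two fields are not both numeric are skipped.
gen_kill_sql() {
    awk 'NF >= 2 && $1 ~ /^[0-9]+$/ && $2 ~ /^[0-9]+$/ {
        # \047 is a single quote in awk string literals
        printf "ALTER SYSTEM KILL SESSION \047%s,%s\047 IMMEDIATE;\n", $1, $2
    }'
}

# Example input: invented SID and SERIAL# values, not from a real system.
printf '131 4711\n204 92\n' | gen_kill_sql
```

The output is only text, so it can be inspected first and then pasted into a SQL*Plus session connected as SYSDBA.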
        • 1. Re: How to kill an RMAN job?
          Levi Pereira
          Hi,

          See the documentation on this topic:

          Terminating an RMAN Command
          http://docs.oracle.com/cd/E11882_01/backup.112/e10642/rcmtroub.htm#BRADV176

          If you have any questions, post them here.

          The RMAN utility is only a client that opens sessions on the Oracle server. RMAN itself does not create or read datafiles or backup sets; the Oracle server does. RMAN only manages and controls the work.
          If you kill only the RMAN client process, there is no guarantee that its sessions on the Oracle server will be killed too. So, if you want to stop an RMAN job, you must kill the server sessions, not the RMAN client.

          Regards,
          Levi Pereira

          Edited by: Levi Pereira on Dec 11, 2012 2:26 PM
          • 2. Re: How to kill an RMAN job?
            mseberg
            Hello;

            I believe kill -9 (if you really have to) works. Why the session still shows up is an interesting question. Since you are on 11gR2, you have two more views which may help:

            V$PROCESS_GROUP
            V$DETACHED_SESSION

            I do not have a good example for either, but these might help:

            Bug 5453737 WHEN A SESSION IS KILLED, PADDR CHANGES IN V$SESSION BUT ADDR NOT IN V$PROCESS

            How To Find The Process Identifier (pid, spid) After The Corresponding Session Is Killed? [ID 387077.1]

            Best Regards

            mseberg

            Edited by: mseberg on Dec 11, 2012 8:02 AM
            • 3. Re: How to kill an RMAN job?
              Victor Armbrust
              I usually use "kill -9"; however, the session is not removed from the database at the same moment. I think the recommendation from Levi Pereira is the best option.
              • 4. Re: How to kill an RMAN job?
                961716
                Well, it appears Oracle wants us to handle it by using the 'sid_in_rman_output'.

                We have to do this for each channel we are allocating (or if using parallelism) by getting this information from the rman log file.

                Example:

                RMAN> 2> 3> 4> 5> 6> 7> 8> 9> 10> 11> 12> 13> 14> 15> 16> 17> 18> 19> 20> 21> 22> 23> 24> 25> 26> 27>
                allocated channel: d1
                channel d1: SID=131 device type=DISK

                allocated channel: d2
                channel d2: SID=204 device type=DISK

                allocated channel: d3
                channel d3: SID=13 device type=DISK

                allocated channel: d4
                channel d4: SID=69 device type=DISK

                Starting backup at 11-DEC-12
                channel d1: starting compressed incremental level 0 datafile backup set

                Then, we select sid, serial# from v$session where sid in ('..blah,blah,blah') and kill them one-by-one in SQL*Plus.
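
A minimal sketch of that lookup, assuming the RMAN log contains "channel <name>: SID=<n> device type=..." lines like the ones above. The function name rman_sid_query is made up for illustration; it only prints the query text and does not connect to the database.

```shell
#!/bin/sh
# Hypothetical helper: read an RMAN log on stdin, collect the channel SIDs,
# and print a query that returns SID and SERIAL# for those sessions.
rman_sid_query() {
    # Keep only the number after "SID=" on channel allocation lines,
    # then de-duplicate and join the values with commas.
    sids=$(sed -n 's/^[[:space:]]*channel [^:]*: SID=\([0-9][0-9]*\) device type=.*/\1/p' |
        sort -n -u | paste -s -d, -)
    [ -n "$sids" ] || { echo "no channel SIDs found" >&2; return 1; }
    printf 'SELECT sid, serial# FROM v$session WHERE sid IN (%s);\n' "$sids"
}

# Example using channel lines from the log excerpt above.
rman_sid_query <<'EOF'
allocated channel: d1
channel d1: SID=131 device type=DISK
allocated channel: d2
channel d2: SID=204 device type=DISK
channel d1: starting compressed incremental level 0 datafile backup set
EOF
```

In practice the log file would be redirected into the function, and the printed query run in SQL*Plus to get the SERIAL# for each channel.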

                But what I don't like is this caveat: "This statement has no effect on the session if the session stopped in media manager code."

                And, then there is the bug that mseberg mentioned: 5453737 WHEN A SESSION IS KILLED, PADDR CHANGES IN V$SESSION BUT ADDR NOT IN V$PROCESS
                • 5. Re: How to kill an RMAN job?
                  Levi Pereira
                  "Then, we select sid, serial# from v$session where sid in ('..blah,blah,blah') and kill them one-by-one in SQL*Plus.

                  But what i don't like is: This statement has no effect on the session if the session stopped in media manager code."
                  As I mentioned in the edit to my previous post.

                  I never kill the RMAN utility to stop an Oracle RMAN job; we must always stop the session on the server side.

                  It's the same when we use SQL*Plus to execute a procedure: if we kill the SQL*Plus utility, the procedure and its session will keep running on the server side.

                  The SQL*Plus and RMAN utilities are only clients.

                  About bug 5453737 :
                  Status: Closed, Not a Bug

                  When I need to kill a session, I never use "alter system kill session ... immediate". I use kill -9, with caution. (I don't recommend that others do this in every case, but I do it because I know what I'm doing.)

                  I always stop an RMAN job this way (simple and easy):
                  ### Start backup
                  
                  RMAN> backup database plus archivelog;
                  Starting backup at 11-DEC-12
                  current log archived
                  using target database control file instead of recovery catalog
                  allocated channel: ORA_SBT_TAPE_1
                  channel ORA_SBT_TAPE_1: SID=478 device type=SBT_TAPE
                  channel ORA_SBT_TAPE_1: Data Protection for Oracle: version 5.5.1.0
                  channel ORA_SBT_TAPE_1: starting archived log backup set
                  channel ORA_SBT_TAPE_1: specifying archived log(s) in backup set
                  
                  
                  
                  ### Connect on Oracle node Server as oracle user:
                  
                  $ sqlplus / as sysdba 
                  
                  ### Check what Session will be killed
                  
                  select s.sid, s.serial#, s.username,
                         to_char(s.logon_time,'DD-MON HH24:MI:SS') logon_time,
                         p.pid oraclepid, p.spid "ServerPID", s.process "ClientPID",
                         s.program clientprogram, s.module, s.machine, s.osuser,
                         s.status, s.last_call_et
                  from  v$session s, v$process p
                  where s.paddr=p.addr
                  and s.program like 'rman%'
                  order by s.sid;
                  
                  ### Generate simple command
                  select '! kill -9 '||p.spid  kill_rman_process
                  from  v$session s, v$process p
                  where s.paddr=p.addr
                  and s.program like 'rman%'
                  order by s.sid;
                  
                  
                  KILL_RMAN_PROCESS
                  ----------------------------------
                  ! kill -9 2516
                  ! kill -9 2517
                  ! kill -9 2497
                  
                  ### Just copy and paste
                  SQL> ! kill -9 2516
                  SQL> ! kill -9 2517
                  SQL> ! kill -9 2497
                  
                  ### and result is: 
                  
                  
                  channel ORA_SBT_TAPE_1: starting piece 1 at 11-DEC-12
                  ORACLE error from target database:
                  ORA-03114: not connected to ORACLE
                  
                  RMAN-00571: ===========================================================
                  RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
                  RMAN-00571: ===========================================================
                  ORA-03114: not connected to ORACLE
                  RMAN-03002: failure of backup plus archivelog command at 12/11/2012 14:55:06
                  ORA-03114: not connected to ORACLE
                  ORA-03135: connection lost contact
                  
                  RMAN>
                  I don't even touch the RMAN client. This way I know that the job was killed, because I can see the error output in the RMAN client.
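
As an extra check at the OS level, a small sketch like the following can confirm that the server processes are really gone. The function name still_running is made up for illustration, and it assumes it is run as the owner of the processes (here, oracle), since kill -0 also fails for processes that belong to another user.

```shell
#!/bin/sh
# Hypothetical helper: print each PID from the argument list that still exists.
# kill -0 sends no signal; it only tests whether the PID can be signalled.
still_running() {
    for pid in "$@"; do
        if kill -0 "$pid" 2>/dev/null; then
            echo "$pid"
        fi
    done
}
```

Called as still_running 2516 2517 2497 after the kill commands above, it prints nothing once all three server processes have exited.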

                  Hope this helps,
                  Levi Pereira
                  • 6. Re: How to kill an RMAN job?
                    961716
                    Thanks Levi, that's a sweet little query to build the kill statements.
                    In my case, I only killed the OS processes that included 'rman' in the process name (ps -ef | grep rman).
                    I'm assuming if I did it your way, it would get processes that would not have been picked up the way I did it.