4 Replies Latest reply: Dec 12, 2013 9:10 PM by Vignan RSS

    DBMS_SCHEDULER: JOBS failure on database shutdown

    Vignan

      Hi

      I am new to DBMS Scheduler.

      We are using dbms scheduler chain and job for executing procedures daily.

      But whenever there is database shutdown or crash, job immediately goes into CHAIN_STALLED state.

       

      Is there any way such that after the database restores the chain starts executing from the failed step without manual intervention.

       

      Thanks,

      Vignan

        • 1. Re: DBMS_SCHEDULER: JOBS failure on database shutdown
          spajdy

          Use dbms_scheduler.alter_chain and set attribute:

          Doc say:

          RESTART_ON_RECOVERY—If the RESTART_ON_RECOVERY attribute is set to TRUE for a step, then if the step is stopped by a database shutdown, it is restarted when the database is recovered. If this attribute is set to FALSE, then if the step is stopped by a database shutdown, the step is marked as stopped when the database is recovered and the chain continues.


          check this job atrribute too:

          restartable

          This attribute specifies whether a job can be restarted in case of failure. By default, jobs are not restartable and this attribute is set to FALSE. Setting this to TRUE means that if a job fails while running, it will be restarted from the beginning point of the job.

          In the case of a chain job, if this attribute is TRUE, the chain is restarted from the beginning after an application failure. If this attribute is FALSE, or if there has been a database failure, the chain is restarted at the last running step. Therestart_on_recovery attribute of that step then determines if the step is restarted or marked as stopped. (If marked as stopped, the chain evaluates rules and continues.)

          Note that setting this attribute to TRUE might lead to data inconsistencies in some situations, for example, if data is committed within a job.

          Retries on errors are not counted as regular runs. The run count or failure count is not incremented until the job succeeds or has failed all its six retries.

          The restartable attribute is used by the Scheduler to determine whether to retry the job not only on regular application errors, but after a database malfunction as well. The Scheduler will retry the job a maximum of six times. The first time, it will wait for one second and multiply this wait time with a factor of 10 each time thereafter.

          Both the run count and failure count are incremented by 1 if the job has failed all its six retries. If the job immediately succeeds, or it succeeds on one of its retries, run count is incremented by 1.

          The Scheduler will stop retrying a job when:

          • One of the retries succeeds.
          • All of its six retries have failed.
          • The next retry would occur after the next regularly scheduled run of the job.

          The Scheduler no longer retries the job if the next scheduled retry is past the next regularly scheduled run for repeating jobs.


          • 2. Re: DBMS_SCHEDULER: JOBS failure on database shutdown
            Vignan

            Thanks a lot for the reply.

            I am still left with a little bit of confusion.

             

            Now for example we have a scheduled job that runs daily at 4.00 AM. We a have a chain associated with it and it has 6 steps.

            Now if because of a database crash or shutdown the chain stopped at 4th step and the job goes to CHAIN_STALLED state.

            If we have RESTART_ON_RECOVERY attribute for the step is set to TRUE then will it restart automatically after the database is up?

            or do we also need to have the RESTARTABLE attribute for the JOB to be FALSE for the automatic restart to happen from that particular step.

            BTW is the retry duration incremental factor is fixed or can we modify it.

            Please provide any link which provides detailed information.

             

            Thanks in advance,

            Vignan

            • 3. Re: DBMS_SCHEDULER: JOBS failure on database shutdown
              spajdy

              1/ as  doc says:

              In the case of a chain job, if this attribute is TRUE, the chain is restarted from the beginning after an application failure. If this attribute is FALSE, or if there has been a database failure, the chain is restarted at the last running step. Therestart_on_recovery

              attribute of that step then determines if the step is restarted or marked as stopped. (If marked as stopped, the chain evaluates rules and continues.)


              So it's up to you how do you set atrributes.

              You know what's the bussines logic implementted in your job. You have to decide how job recover from failure including DB crash or shutdown.

              For chain job is more logical set job's attribute RESTARTABLE to FALSE and for each chain step set attribute RESTART_ON_RECOVERY to TRUE.

              With this settings when DB is started after crash then it automatically restart job's step which is runnig when DB crashed.


              I suppose that following steps are done during startup process of DB:

              1/ find all jobs with status RUNNING

              2/ if job has RESTARTABLE=TRUE then run job

              3/ if job has RESTARTABLE=FALSE and job is not CHAIN then change job state to STOPPED

              4/ if job has RESTARTABLE=FALSE and job is CHAIN and all job steps have RESTART_ON_RECOVERY=FALSE then change job state to STOPPED

              5/ if job has RESTARTABLE=FALSE and job is CHAIN and step that is RUNNING have RESTART_ON_RECOVERY=TRUE then start this step

              6/ if job has RESTARTABLE=FALSE and job is CHAIN and step that is RUNNING have RESTART_ON_RECOVERY=FALSE then change state of this step to STOPPED

              7/ evaluate rules for all CHAIN job found in 1/ minus jobs from 5/

              8/ check job that missed theirs runs and run them or schedule them to next run with respect to SCHEDULE_LIMIT attribute

               

              Chain job evalute it's rules and when no rule is evaluated as TRUE then job get stuck and it's state chage to CHAIN_STALLED. This is general behavior.

               

              All this is only my speculation how ORACLE works,I don't investigate this behavior so deeply.

               

              2/ What do you exactly mean by retry duration ?

              • 4. Re: DBMS_SCHEDULER: JOBS failure on database shutdown
                Vignan

                Thanks that clears my confusion.

                Regarding  retry duration I am talking about the wait time between scheduler retry.

                The Scheduler will retry the job a maximum of six times. The first time, it will wait for one second and multiply this wait time with a factor of 10 each time thereafter.

                 

                Thanks,

                Vignan.