This content has been marked as final. Show 8 replies
It sounds like your Oracle Scheduler system may be shut down.
If this is your private db then you can try out the following operations to investigate.
Otherwise check with your dba.
Question: How can I shut down the scheduler?
ALTER SYSTEM SET JOB_QUEUE_PROCESSES = 0;
The following query confirms that the job is scheduled:
select * from DBA_SCHEDULER_JOBS;
Then turn it back on:
ALTER SYSTEM SET JOB_QUEUE_PROCESSES = 100;
To see what the current setting is run the following as sys:
select * from v$parameter where name='job_queue_processes';
If the value is set to 1, then only one workflow would be able to run at a time.
My job queue processes is set to 1000 so that does not seem to be the issue. Right now there are 20 jobs in the scheduler and all but my ODM ones seem to be fine. All of my ODM jobs are showing a status of "Chain Stalled". I can't seem to figure out why or where the chain is stalled or how to get it going again. Thanks again for any help you can provide.
Having a chain stalled condition is not something we would expect to see.
See snippet below regarding Oracle Scheduler and CHAIN_STALLED status.
Was this ODMr Installation ever working on this db?
Can you run even a single node workflow, such as one with a data source node?
Have you tried to cancel/stop any of the stalled workflows?
We can provide some queries to better understand what is triggering the stalled state if that is something you are in a position to do.
At the completion of a step, the chain rules are always evaluated to determine the next steps to run. If none of the rules cause another step to start, none cause the chain to end, and the evaluation_interval for the chain is NULL, the chain enters the stalled state. When a chain is stalled, no steps are running, no steps are scheduled to run (after waiting a designated time interval), and no event steps are waiting for an event. The chain can make no further progress unless you manually intervene. In this case, the state of the job that is running the chain is set to CHAIN_STALLED. (However, the job is still listed in the SCHEDULERRUNNING_JOBS views.)
You can troubleshoot a stalled chain with the views ALL_SCHEDULER_RUNNING_CHAINS, which shows the state of all steps in the chain (including any nested chains), and ALL_SCHEDULER_CHAIN_RULES, which contains all the chain rules.
You can enable the chain to continue by altering the state of one of its steps with the ALTER_RUNNING_CHAIN procedure. For example, if step 11 is waiting for step 9 to succeed before it can start, and if it makes sense to do so, you can set the state of step 9 to 'SUCCEEDED'.
Alternatively, if one or more rules are incorrect, you can use the DEFINE_CHAIN_RULE procedure to replace them (using the same rule names), or to create new rules. The new and updated rules apply to the running chain and all future chain runs. After adding or updating rules, you must run EVALUATE_RUNNING_CHAIN on the stalled chain job to trigger any required actions.
This is our first time trying to get ODM to work on this node, so no it was not working before. No workflows run, even a single data source node. All workflows get put into a scheduled state of chain stalled. I was able to force the "WF_START" step in the workflow to SUCCEEDED and that also did not good. It is now just waiting at step 10001 which is the data source. Very strange. My other system is working fine, but it is on oracle 184.108.40.206 and sqldeveloper 3.2.
Well, maybe lets try a "do over".
Can you use the latest: SQL Developer 3.2.2 RTM Version 3.2.20.09 Build MAIN-09.87
I think you said SQL Dev 3.2 works fine on your other db, so this may help.
Now in your response you mentioned oracle 220.127.116.11, but I suspect that is a typo and you meant 18.104.22.168.
Using SQL Developer 3.2.2
1) Drop the repository on bad node.
Via UI, go to Tools->DataMiner->Drop Repository.
2) Now install the new Repository.
You can either use the UI or use the scripts.
Save the log.
3) Retry creating and running a one node workflow.
Lastly, any significant configuration differences between the 2 dbs?
OK, tried to put 3.2 on the server and still can't get anything going. When I try to either upgrade or drop the repository I get an error message saying workflows are either queued or running and it cannot take any further action. From the DB Connections tab I dropped all nodes under the Scheduler->Jobs and Scheduler->Chains nodes and I still get the error.
The ODMr Repository keeps its own state on workflows as well.
Run the following query under sys:
This will come up with a value of 0 if there are no jobs running or in queue.
SELECT COUNT(*) WORKFLOWS_NOT_INACTIVE FROM "ODMRSYS"."ODMR$WORKFLOWS" WHERE status NOT IN ('INACTIVE','STOPPED','STOPPING');
Just to validate what the final workflow job status was, run the following query and let us know what the STATUS value was:
If there were workflows that were stuck in queued state, and they could not be canceled via Data Miner, then the workflow records in the ODMr repository will need to be manually updated prior to dropping the repository.
select * from "ODMRSYS"."ODMR$WORKFLOWS";
Run the following update as sys:
After that, you should be able to drop the repository and reinstall.
UPDATE "ODMRSYS"."ODMR$WORKFLOWS" SET STATUS = 'INACTIVE'; COMMIT;