How to migrate an Agent-only 12c instance using BAR file
This concerns 12.2.1.4.0 on Oracle Linux Server 6.9.
We have an instance which is dedicated to running agents. There are nearly one thousand running agents on this instance, with non-trivial integration into other parts of our infrastructure.
Our problem is that we do not seem to have a painless way to migrate this instance using an out-of-place upgrade process. When importing the BAR file from the old instance to the new, OOP instance, we get RPD, webcat, and security just fine, however *_BIPLATFORM.S_NQ_JOB is empty and none of the Agents are scheduled. The brute-force solution, which we have done once and have no desire to ever do again, is to go re-save every single Agent so it gets re-registered in the jobs table. Unfortunately, due to our integration points we use command-line job execution for certain business needs and these rely on Job ID, which of course changes if we manually re-register all jobs in the job table. So after spending dozens of person-hours touching all Agents in the webcat, we then get to spend another few person-hours collecting new Job IDs and updating scripts.
I have searched this forum and the wider Googleverse and I have not found anybody specifically discussing how to bring up Agents after a BAR-file migration. I am a bit surprised that the BAR file import does not at least automatically re-register all active Agents in the webcat into S_NQ_JOB. This at least would keep them all running, though we would still have to collect the new Job IDs. It would save us about 75% of the work we have to do otherwise.
Our next attempt is going to be to try to lift S_NQ_JOB and S_NQ_JOB_PARAM out of the old instance's schema and drop it into the new schema. Webcat paths are obviously going to be the same so we're hopeful that obisch will be happy and keep on humming along.
And but so, my question for the forum is if someone has found a good process to do an OOP migration/upgrade while preserving Agents - where they run as scheduled without additional work, and preferably with the same Job IDs.
Thanks in advance.
Answers
-
The status of the agent (active or disabled) is missing in any migration approach so far.
That was the origin of https://gianniceresa.com/2017/03/scripts-to-use-agents-in-obiee-12c-mass-enabling-and-execution-from-a-page/ , a piece of javascript which allows to re-enable agents all at once.
The link between the schedule tables and the catalog objects is quite weak, therefor the "clean" way Oracle suggest is to re-enable agents (the schedule being saved inside the agent XML it just require a click).
You could also script all that if you want (https://gianniceresa.com/2017/03/obiee-12c-agents-enable-them-by-code/ has pieces of where to look for and how things works), but my conclusion was that for a migration (a one-shot thing), the safer was to stick with the official way and just take a shortcut by enabling them all at once.
0 -
That's precisely the problem. Something can have been scheduled but is now set to inactive. Just going through everything and build-enabling can be dangerous. It's really handled very poorly in any case right now-
0 -
Alright thanks. I will read your writeups Gianni. We thought about ways to script/automate the re-registering of Agents, which would at least save us some time. We have done the "clean/recommended" way before manually - 3 developers working for basically an entire day to go click on every agent in the system. Not fun. And after that we have to update all of the scripts which reference the (old) Job IDs.
The BAR extraction process should read S_NQ_JOB and S_NQ_JOB_PARAM and include an extract in the BAR. The import function, likewise, should populate the target S_NQ_JOB tables (with an option to skip this, such as 'noagents'). Failing that, the active/inactive status of an agent is stored in the XML so the BAR import could at least auto-re-register all active Agents in the webcat (again, with an option to disable this step).
I still want to play around with just dropping S_NQ_JOB/PARAM from source into target schemas and see what happens. As far as I can tell, scheduler knows where each job lives in the webcat by one of the entries in the PARAM table. There does not appear to be anything, how do I say it..."special" in those tables with the exception of the Job ID. There is no insert trigger and no sequence in the BIPLATFORM schema, so the IDs must be created by the bi or scheduler process when an agent is saved. If it does something like "select max(job_id)+1" from S_NQ_JOB and uses the result for the newly-saved job, perhaps it could survive a drop-in copy of S_NQ_JOB from another instance. If instead there is some internal sequence - reset to 0 or 1 with a fresh instance install - that is used to assign Job IDs, then I'd expect Job ID collisions after plopping down a bunch of S_NQ_JOB rows that scheduler didn't create itself.
Before that, I need to read Gianni's posts. Probably I'll be heading down a blind alley...
I'm a little surprised that there isn't more alarm and hand-wringing about this obvious and (I would think) broadly encountered problem. It's a glaring hole in the BAR file migration process.
0 -
True that you could imagine agents to be supported by the BAR, I imagine Oracle will be more than happy to receive this ER (and they will also probably not care much for a while if not having lot of related ER).
As it's 2 years old almost I don't have all the details in mind: in your case a copy/paste of the whole content of the JOBs tables could work in theory. There isn't anything, as far as I know, describing where the sequence of Job ID is saved or how it is managed and if the scheduler has some other references somewhere than just the content of the table. As you say from the tables to the agent only the path is the link in theory. The other way round is managed by the ID saved inside the agent itself.
As your target is a empty system there are chances it could work, but I would do it only once, I wouldn't start playing with it multiple times. (the lack of formal description of the links between pieces is a bit scaring as it really seems to be a very weak link).
I would definitely do it as a test by entering fake SMTP settings to avoid email to be sent around in case of issues, but if you have 20 minutes ... you can give it a try. For sure it's shorter than re-enabling agents if it fails.
0 -
Downloaded and installed the .catalog file for the Agent Manager page. It's capital-A Awesome. Thanks. I'm going to put in a "Disable" button and then play with adding a filter to the list for our business units - show me only Finance agents, only M&S agents, and so on.
(ETA 20 minutes after writing that: eesh, I am a total jquery n00b. I guess now is as good a time as any to learn.)
This will, at least, cut our future workload down by 60-70% when we migrate. If my wholesale S_NQ_JOB replacement idea doesn't pan out, this Javascript wizardry has already given us a vast improvement.
Now for SEO - I searched Google for a few combinations of "OBIEE 12c", "migration", "BAR file", "agents", "S_NQ_JOB", "inactive" and so on, and your page did not come up! Thow a <meta> tag in there!
0 -
Worth nothing that the default "search results" limit applies, in the comment of the blog post there is a reference to the setting for your OBIPS to get more than 300 agents (if you have more than that).
For me the page come up as first in Google for "obiee 12c agents enable", but maybe Google knows me too much and put my own things first in the results
0 -
Yes, if I use that string it does come up. I was approaching my search more from the standpoint of "hey this BAR file migration didn't work" so my terms weren't quite right.
Where does one go to learn, for example, all of the parameters available for the CatalogTreeModel call? I can tell what it is doing in your script, but I can't find documentation for it, or even blog posts besides yours and one from RittmanMead about granular OBIPS logging. Without your example to guide me, how would I find out that there is a "mask", a "recurse", a "sig" option (and which "sigs" are valid)? I've got no problems diving into the /bin directory of the install; I just have no idea how to get started.
0 -
Forget documentation and any other form of official document (wouldn't be fun otherwise).
(I'm not responsible for any damage you could do to your OBIEE install or any infringement of accepted license terms etc.)
You can get all those things by using, a lot, your developer tools in your browser (or the same thing with a similar name depending on the browser). Then you open the webpage with the catalog search and you start tracking down javascript calls and events. When you find a call of a function with a nice or intriguing name you search in the loaded JS files of your browser where it comes from. Often the JS libraries of OBIEE are compacted and there the Developer toolbar can also help as it make it readable again. And piece by piece you find what is called, from where, passing which parameters etc.
I guess I was editing the JS files on disk to add debugging output in my browser console (using "console.log") to track where things were going and all the variables passed etc.
After a bit there inside, assuming you like javascript, things start making more and more sense and you find similarities with the webservices existing etc.
It's a very empirical approach.
0 -
Alright thanks. I'm very familiar with the Dev Tools; I have solved other problems using the method you describe. I was more wondering, for example, with the CatalogTreeModel call for saw.dll - this is not Javascript, correct? This is something buried in the bi binaries. So by using the Dev Tools I can see that "saw.dll?CatalogTreeModel" gets called and also see that "sig=coibot1" is sent as a parameter. Hey that's neat maybe that lets me get a list of iBots/Agents.
But if for example there was another valid value for sig - "cofoobar" - that I never see by poking around in the Dev Tools. How would one know such a thing exists, if one is not on the OBIEE dev team? Maybe the answer is "you can't know", but I figure if there is a way, you might know of it. Maybe it's a matter of diving into analytics.jar, which really could break stuff. That's why we have a sandbox server.
0 -
Ok, for that you have 2 possible ways.
One is to look for what the UI expose as possible parameters and assume that everything is there. The "coibot1" for example is the signature for the catalog files representing agents, I crossed it while navigating the catalog by web services calls and doing things in that way. From there you can assume that any other signature of object is a valid value for that parameter.
Let's be honest, you will never know exactly what is what and what is possible in this way, you need to trust the fact that the UI expose almost everything which is possible.
If you want to go the other way round, as you suggest, you could look into files containing the source code of the tool.
Let's make the hypothesis I maybe looked there once, just an hypothesis as we all know it's against the license terms and therefore I never did it ...
The issue is that soon the Java stops and you end up with C++ things, which is just impossible to decompile (ok, it's possible but honestly I have the impression that chinese would be easier to learn) and find out more. Therefore this approach isn't worth the investment of time as it's very disappointing.
Again, if Oracle is listening, this is just an hypothesis and I never did that a single time
0