Running Focused ODI Jobs on the Entire Ecosystem [Article]


    This article by Hyperion architect and independent consultant Dmitry Kryuk describes the process of running focused calculations with Oracle Data Integrator (ODI) and pushing changed data sets into ASO/BSO cubes across the environment.

    by Dmitry Kryuk





    Although Hyperion Planning and Calculation Manager allow running focused calculations by passing Runtime Prompts (RTPs) from the client side, this approach has some limitations:

    • Data movements between databases with different dimensionality are done with the @XWRITE/@XREF functions, a much slower method than, say, DATAEXPORT/MDX with a subsequent data import using MaxL.
    • Built-in CDFs (like @CalcMgrMDXDataCopy and @CalcMgrExecuteEncryptMaxLFile) appeared only in recent releases, and they also have limitations.
    • Even with @CalcMgrExecuteEncryptMaxLFile and the ability to execute MaxL from Business Rules (BRs), we are limited by MaxL itself: we need to control concurrent users, keep unique names for data files and logs, and so on.


    So what are we trying to achieve? Although the same method can be used in a wide range of scenarios, here we try to:

    • Allow the user to run BRs and pass RTPs to an ODI job
    • Perform focused calculations and distribute the changed data across the environment, including ASO cubes
    • Make data movements faster by using alternative methods and MaxL
    • Control multiple concurrent users and prevent collisions between them
    • Notify users about completed calculations and provide a monitoring tool


    Figure 1


    Below is the list of steps and components that will help you achieve these goals. Keep in mind that the design presented below is one of many possibilities, and may be overkill for small environments. Also, the specific implementation will depend on the business process, IT preferences about the location of the codebase, and more. Here’s the list:


    • Parameterize your calculation scripts (the same way you would develop your Business Rules with CM variables).
    • Move your calculation scripts into MaxL (execute calculation ‘some tokenized script’ on <AppToken>.<DBToken>; ).
    • Build ODI procedure that uses tokenized MaxL script and replaces tokens.
    • Build ODI process that orchestrates the workflow, takes care of unique filenames, logs, timestamps, concurrent users, notifications, etc.
    • Build a custom-defined function (CDF) that essentially executes an OS command on a remote server (assuming your ODI is on a different server).
    • Build a trivial BR that passes the RTP values into the CDF and launches the ODI process.


    This should allow changed data to flow in real time across your ecosystem. Now, let's dive into the specifics.

    Examples of Script Parameterization


    Tokenized Export

    The script below obviously performs DATAEXPORT. The exact settings are not important here--what is important is the use of FLEXOPTIONS. Basically, it is exactly the same concept as variables in Business Rules, but instead of writing {SomeVariable} we write <SomeVariable>.


    Why do we use <FlexOption1> and not, for example, a more meaningful <Versions>? The answer: you certainly could, but I prefer to keep the number of ODI objects to a minimum. I already have a procedure that takes a text file as input and replaces tokens <FlexOption1> - <FlexOption20> with provided parameters. I could create a new procedure that uses <Versions>, <CostCenter>, <BusinessUnit>, <Timestamp>, etc., but I would end up with two nearly identical procedures to manage. Here's the bottom line: it’s just a matter of personal preference and other factors like change and version control.
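    To illustrate this design choice, a minimal sketch (not the actual ODI procedure) of a generic, positional token replacement routine -- one routine serves every script because the tokens are numbered rather than named:

```python
# Minimal sketch of the generic-token idea: one routine replaces the
# positional tokens <FlexOption1>..<FlexOption20>, so the same procedure
# serves any script, regardless of what each token means.
def replace_flex_options(text, values):
    # values[0] replaces <FlexOption1>, values[1] replaces <FlexOption2>, ...
    for i, value in enumerate(values, start=1):
        text = text.replace('<FlexOption%d>' % i, value)
    return text
```

    With named tokens (<Versions>, <CostCenter>, ...) each family of scripts would need its own replacement procedure; with positional tokens, one procedure is enough.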


    execute calculation
    '
    /* ======================================================================== */
    /*=====Mapping of the FLEXOPTIONS========================================== */
    /*=====<FlexOption1>:Versions, type members =============================== */
    /*=====<FlexOption2>:CostCenter, type member. ============================= */
    /*=====<FlexOption3>:Entity, type member. ================================= */
    /*=====<FlexOption4>:BusinessUnit, type member. =========================== */
    /*=====<FlexOption5>:PCC, type member. ==================================== */
    /*=====<FlexOption9>:Job Timestamp. ======================================= */
    /*=====<FlexOption10>:Log path. =========================================== */
    /* ======================================================================== */
    SET DATAEXPORTOPTIONS
    {
         DataExportColFormat ON;
         DataExportDynamicCalc OFF;
         DataExportOverwriteFile ON;
         DataExportDimHeader ON;
    };
    FIX (&CurrYr, &NextYr, &Scenario, ..., "Jan":"Dec"
        ,<FlexOption1>, @RELATIVE(<FlexOption3>,0))
        DATAEXPORT "File" "," "<AppToken1>\<DBToken1>\SomeFile_<FlexOption9>.txt" "#Mi";
    ENDFIX;
    '
    on <AppToken1>.<DBToken1>;




    One important parameter here is <FlexOption9>. As the mapping header states, this is a timestamp. It prevents one instance of the ODI job from overwriting the export file of another: if multiple users run the job at the same time, each gets their own export file, which also makes troubleshooting easier. The timestamp is generated at the beginning of the ODI job and is reused throughout that session.
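    To make the idea concrete, here is a minimal sketch of session-scoped timestamp generation. This is an illustration only, not the actual ODI code; the exact format string is an assumption -- anything collision-safe per session works:

```python
import datetime

# Generate the timestamp once, at the start of the ODI session, and reuse
# it for every file created in that session.
def session_timestamp():
    return datetime.datetime.now().strftime('%Y%m%d%H%M%S')

ts = session_timestamp()                    # fixed for the whole session
export_file = 'SomeFile_%s.txt' % ts        # unique per job instance
error_file = 'SomeFile_Error_%s.txt' % ts   # matching error-log name
```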


    Tokenized Import

    When we need to load that unique file, we use the same token in the MAXL statement:

    import database "<AppToken1>"."<DBToken1>" data 
    from server text data_file 'SomeFile_<FlexOption9>.txt' 
    using server rules_file 'SomeRule' 
    on error write to "<FlexOption10>\\SomeFile_Error_<FlexOption9>.txt";


    Here again we use a timestamp token to create the log file.

    Tokenized Clear Region MDX

    Now, let's assume we want to push only the changed data into the ASO cube, and that we want to clear data in a region first, and then load a previously exported file. The clear statement would look something like this:

    alter database "<AppToken2>"."<DBToken2>" clear data in region
    'CrossJoin(Descendants([<FlexOption2>],[CostCenter].levels(0)), Descendants([<FlexOption4>],[BusinessUnit].levels(0)))'
    physical;




    Tokenized ASO Import


    The following statement loads the ASO cube:


    alter database "<AppToken2>"."<DBToken2>" initialize load_buffer with buffer_id 1;
    import database "<AppToken2>"."<DBToken2>" data 
    from server text data_file "SomeExportFile_<FlexOption9>.txt" 
    using server rules_file "pushLoad" 
    to load_buffer with buffer_id 1 
    on error append to "<FlexOption10>\\SomeExportFile_Error_<FlexOption9>.txt";
    import database "<AppToken2>"."<DBToken2>" data 
    from load_buffer with buffer_id 1;




    Replacement Procedure

    In the previous section, we saw how to create a parameterized MaxL script that runs on a specific data subset and updates various applications from different corners of the Enterprise Performance Management (EPM) ecosystem.

    How do tokens from that script get replaced with RTPs? For now, let's assume the ODI scenario receives RTPs as input parameters (we’ll discuss the exact method in the next section). ODI uses the parameterized script as a template: it replaces all the tokens and creates a new script, with tokens replaced and a timestamp added, which is then executed. Replacement is done via a procedure that accepts the RTP values as FlexOptions. Another important option is “ReadFile”, which points to the parameterized MaxL template.


    Figure 2


    The actual replacement is done with Jython:



    import os

    ### FileName option is passed as a parameter from the ODI scenario.--------###
    ### This is the file into which the MAXL script with replaced tokens-------###
    ### will be written.-------------------------------------------------------###
    out_file = '<%=odiRef.getOption("FileName")%>'
    ### ReadFile option points to the tokenized MAXL template.-----------------###
    read_file = '<%=odiRef.getOption("ReadFile")%>'
    ### Map each token to the RTP value passed into the scenario.--------------###
    tokens = {
      '<FlexOption1>': '<%=odiRef.getOption("FlexOption1")%>',
      '<FlexOption2>': '<%=odiRef.getOption("FlexOption2")%>',
      ### ...and so on, up to <FlexOption20>.-----------------------------------###
    }
    ### Read the content of the tokenized ODI file-----------------------------###
    if os.path.exists(read_file):
      out = open(out_file, 'w')
      for line in open(read_file, 'r'):
        ### Replace tokens with RTPs ------------------------------------------###
        for token, value in tokens.items():
          line = line.replace(token, value)
        ### Write the executable MAXL script ----------------------------------###
        out.write(line)
      out.close()



    Passing Business Rule RTPs to ODI

    So how exactly are RTPs passed into an ODI scenario? Say a user saves a Smart View form, and a BR runs on save. The BR uses Calculation Manager variables, which are substituted with members from the form. What’s next? In order to pass those members into ODI, we need a tiny custom-defined function (CDF).


    Tiny CDF (Or how to launch a batch script from BR on remote server and take care of security)

    The idea behind this CDF is to execute a batch script, which will launch the ODI job. The batch file will accept parameters from RTPs, which will be passed to the ODI job in turn (see Figure 1, above).


    A pretty common case is having ODI on a different server, with Windows as the OS and the Essbase service running as “Local System”, so let's review it. The problem is that the “Local System” account does not have access to other servers, even if you use the PsExec utility. You could either run the batch as a different user or provide a different user to PsExec, but then you’ll also need to provide the password.


    We probably don’t want to deal with password encryption and authentication between the servers in this CDF. As a workaround, we could do the following:


    Figure 3


    • Create a Scheduled Task that runs a batch file; let's say that batch file is called FocusedODI.bat. The Scheduled Task runs it as a user that has sufficient privileges on the local and remote ODI servers. The Task also stores the password.
    • FocusedODI.bat contains a command that triggers an ODI job on a remote server. It looks something like this:

    PsExec.exe -accepteula -h \\hyp-app-wd3 D:\\Hyperion\\Automation\\LaunchScripts\\FOCUSEDODI.bat "Ver1" "VP_Sales" "B_100" "P_100" "E_100"

    • When the CDF is called, two things happen:
    1. The CDF overwrites the content of FocusedODI.bat. As you can see, the members passed into FOCUSEDODI.bat are hard-coded, so we want the CDF to update the command with the RTP values passed from the Business Rule.
    2. Once the new command is written into FocusedODI.bat, the CDF executes the scheduled task. Specifically, it runs the command:

                        cmd /c schtasks /Run /TN \Microsoft\TriggerODI

                    where Microsoft\TriggerODI is the name of the scheduled task.
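    The two steps above can be sketched like this. This is a hypothetical Python illustration only -- the real CDF is a Java function, and the helper names here are invented:

```python
import subprocess

def write_command(bat_path, command_line):
    # Step 1: overwrite FocusedODI.bat with the freshly built command,
    # so the scheduled task always runs the latest RTP values.
    f = open(bat_path, 'w')
    f.write(command_line + '\n')
    f.close()

def run_scheduled_task(task_name):
    # Step 2: fire the scheduled task; it runs the batch as a user that
    # has rights on the remote ODI server (the password is stored by
    # the task itself, so the CDF never handles credentials).
    return subprocess.call('cmd /c schtasks /Run /TN %s' % task_name, shell=True)
```

    Keeping the password inside the Scheduled Task is the whole point of the workaround: the CDF only rewrites a file and pulls a trigger.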




    Calling The CDF

    The function (@JrunWinCommand) accepts a list of parameters:

    1. File into which the command is written. In our example, that is FocusedODI.bat, which will be executed by scheduled task.
    2. OS command that is executed by CDF. In our case, that is a command that triggers scheduled task.
    3. The first piece of the command, written into the file.
    4. All other parameters are enclosed with double quotes and prefixed with a space (“ ”).
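    In other words, the CDF builds one long command line from the third parameter plus every subsequent parameter (each double-quoted and prefixed with a space) and writes it to the file named in the first parameter. A sketch of that string handling, in Python purely for illustration (the CDF itself is Java):

```python
# Parameter 3 is concatenated with parameters 4..n, each wrapped in
# double quotes and prefixed with a space; the result is what gets
# written into FocusedODI.bat.
def assemble_command(first_piece, extra_params):
    return first_piece + ''.join([' "%s"' % p for p in extra_params])

line = assemble_command(
    'PsExec.exe -accepteula -h \\\\hyp-app-wd3 D:\\...\\FOCUSEDODI.bat',
    ['Ver1', 'VP_Sales', 'B_100', 'P_100', 'E_100'])
# line is the full command that ends up inside FocusedODI.bat
```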


    The third parameter is concatenated with the 4th to nth parameters (including double quotes and spaces), and that long string is written into the file specified in the first parameter. Below is the BR that calls the function:




    FIX (
    /* This is a fix on just any single cell that will never change.      */
    /* We want @JrunWinCommand to run only once, regardless of changing   */
    /* metadata. The actual RTPs are passed into the @JrunWinCommand      */
    /* function below.                                                    */
    "FY16", "Q2_Outlook", "Ver1", "HSP_Inputvalue", "Local", "NoEntity",
    "NoGradeGroup", "NoPCC", "NoCostCenter", "NoBusinessUnit"
    )
    "BegBalance" = @JrunWinCommand(
        "D:\\Hyperion\\Automation\\scripts\\FocusedODI.bat",
        "cmd /c schtasks /Run /TN \Microsoft\TriggerODI",
        "PsExec.exe -accepteula -h \\hyp-app-wd3 D:\\Hyperion\\Automation\\Scripts\\LaunchScripts\\Optimization\\SCENPLANBR.bat",
        {Versions}, {CostCenter}, {BusinessUnit}, {PCC}, {Entity});
    ENDFIX





    Each time the BR is called, the content of D:\\Hyperion\\Automation\\scripts\\FocusedODI.bat is overwritten with the new RTPs, e.g.:


    PsExec.exe -accepteula -h \\xxxxxxxxxxxxxxxx D:\\Hyperion\\Automation\\Scripts\\LaunchScripts\\Optimization\\SCENPLANBR.bat "Working1,Working2,Working3" "VP_Sales" "AllBUs" "AllPCCs" "AllEntities"


    That command is triggered by the task scheduler. An important setting of the scheduled task is “Run a new instance in parallel”. This allows multiple ODI jobs to start in parallel; their execution is then governed by the queue.



    Queueing The Jobs


    Some jobs, or parts of jobs, cannot run in parallel. For example, steps that clear data from the ASO cube before loading data cannot run concurrently on the same database. So we need to implement a first in/first out (FIFO) queue, in which the user who triggered a calculation first is at the top of the queue. We do this by reading the HSP_JOB_STATUS table. As its name suggests, this is the table where all started jobs are recorded. In order not to mess with Planning repository objects, we only read that table: we create a replica of it (HSP_JOB_STATUS2) and insert new records from HSP_JOB_STATUS. This table is also used to create a dashboard, as we'll discuss later. Here are the steps for managing the queue:


    1. If it doesn't already exist, create a table called HSP_JOB_STATUS2. This table doesn’t have to be in the Planning repository--it's actually better to have it somewhere else (e.g., staging area, data mart, etc.).
    2. Insert into HSP_JOB_STATUS2 records with a JOB_NAME that corresponds to the current ODI package name (the assumption here is that ODI job name = Business Rule name). Insert only those jobs that don’t already exist in HSP_JOB_STATUS2, and only recent jobs (say, newer than ForgetInterval = 2 minutes). ForgetInterval helps us deal with situations when the ODI server crashed, or when we just deployed the process; otherwise, old junk would get into HSP_JOB_STATUS2. Here are the conditions:

    where 1=1
    and UPPER(JOB_NAME) like UPPER('<%=odiRef.getOption("BRName")%>')
    and (SYSTIMESTAMP - interval '<%=odiRef.getOption("ForgetInterval")%>' minute) < START_TIME
    and JOB_ID not in (select JOB_ID from HSP_JOB_STATUS2)
    order by START_TIME desc

    3. Get the current JOB_ID. To do this, we get the oldest JOB_ID with the default status “started” (1). The next step will change the status to either pending (2) or processing (3). Assuming that step completed successfully for prior job instances, the oldest JOB_ID must be the current JOB_ID.
    4. Get the email/username with which the BR was triggered.
    5. Get the NEXT_STATE of the current job--i.e., whether it should start or wait for its turn. If no other job is running, and among all pending jobs the current one is the oldest, then set #NEXT_STATE to 3 (Running).
    6. Update HSP_JOB_STATUS2 with the new status (the value of #NEXT_STATE).
    7. If #NEXT_STATE is 3 (Running), go ahead and start the main part of the ODI job (actual calculations, data movements, etc.). Otherwise, wait a while and check #NEXT_STATE again. These steps are outlined in the red rectangle in Figure 4, below:


    Figure 4

    The rest is just the logic of your business process. The two ovals in Figure 4 highlight preparing and running the tokenized scripts. The procedure (the first step in each oval) replaces all the tokens (see the Replacement Procedure section, above). The second step in each oval executes the MaxL script created by the replacement procedure.
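    Stepping back to the queue itself, the NEXT_STATE decision can be sketched in simplified form. This is a hypothetical, database-free illustration -- the real logic runs as SQL/ODI steps against HSP_JOB_STATUS2; statuses follow the article's convention (1 = started, 2 = pending, 3 = running):

```python
# jobs is a list of (job_id, start_time, status) tuples.
def next_state(jobs, current_job_id):
    if any(status == 3 for (_, _, status) in jobs):
        return 2  # another job is running -> current job must wait
    # among started/pending jobs, the oldest one goes first (FIFO)
    waiting = sorted((j for j in jobs if j[2] in (1, 2)), key=lambda j: j[1])
    if waiting and waiting[0][0] == current_job_id:
        return 3  # current job is the oldest waiter -> it may run
    return 2      # an older job is ahead in the queue -> keep waiting
```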





    Monitoring The Jobs

    Once focused ODI jobs are running, we need to provide users with the means to monitor them. As mentioned earlier, some jobs cannot run concurrently for multiple users--specifically, those steps that load data into or clear regions of the ASO cube. In those cases, jobs are queued and executed sequentially, one after another.

    This could confuse users, since completion time depends on the number of jobs triggered by other users and the scope of their execution. Hence, we need to give users a reliable way to monitor those jobs. We implement a dashboard that presents the following details for every executed job:

    • User
    • Job name
    • RTP parameters
    • Start time
    • End time
    • Duration for finished/failed jobs
    • Job status (pending, processing, completed, failed)
    • Time graph -- execution time relative to other jobs.

    Below is a screenshot of such a dashboard:


    Figure 5

    From the screenshot, you can see that one user is currently running an ODI job for the AllBUs, VP_Go_To_Market, SFDC, AllPCCs, Working1 combination, and two other users are waiting in the queue. A user who triggered the BR can open this dashboard and see how many users are ahead of them, which parameters are running, and who is running those jobs. Users don’t need to keep watching their jobs, however; once the ODI job for a specific user has finished, that user gets an email notification.


    Purging Old Extracts


    If you export and load data across applications and use unique names for data files, you will need to delete those files from time to time. The ODI job DELETESCENPLANBREXTRACTS does that. It uses a procedure that:

    • Scans the specified database folder (set by the Server, App, and DB options)
    • Finds files matching a specified pattern (the FileToDelete option)
    • Gets the file timestamp from its name and compares it to the current time; if the file is older than the specified number of days (the DeleteOlderThanDays option), the file is deleted.
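    The purge logic above can be sketched roughly as follows. The function name, filename pattern, and timestamp format are assumptions for illustration -- this is not the actual DELETESCENPLANBREXTRACTS code:

```python
import os
import re
import time

def purge_old_extracts(folder, pattern, delete_older_than_days, now=None):
    # cutoff: anything whose embedded timestamp is older than this goes
    now = now if now is not None else time.time()
    cutoff = now - delete_older_than_days * 86400
    deleted = []
    for name in os.listdir(folder):
        m = re.match(pattern, name)   # e.g. r'SomeFile_(\d{14})\.txt'
        if not m:
            continue
        # parse the timestamp embedded in the filename, e.g. 20240115103042
        ts = time.mktime(time.strptime(m.group(1), '%Y%m%d%H%M%S'))
        if ts < cutoff:
            os.remove(os.path.join(folder, name))
            deleted.append(name)
    return deleted
```

    Parsing the timestamp out of the filename (rather than trusting the OS modification time) keeps the purge consistent with how the files were named in the first place.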




    Figure 6


    About The Author

    Dmitry Kryuk is an independent Hyperion consultant and Solutions Architect. He has helped clients including Google, Salesforce, Gilead, Amgen, Dolby and others to design and build their enterprise solutions. His work focuses mainly on Essbase, Planning, ODI, DRM, Oracle DB and custom-built solutions like the one discussed in this article.  He codes in Python, Jython, Java, and PL/SQL.  For more insight from Dmitry please visit his blog:


    Note: This article represents the expertise, findings, and opinion of the author.  It has been published by Oracle in this space as part of a larger effort to encourage the exchange of such information within this Community, and to promote evaluation and commentary by peers. This article has not been reviewed by the relevant Oracle product team for compliance with Oracle's standards and practices, and its publication should not be interpreted as an endorsement by Oracle of the statements expressed therein.