Oracle Business Intelligence Applications

Welcome to the Oracle Analytics Community: Please complete your User Profile and upload your Profile Picture

Refreshing OBIA DW database in lower environments from Production

Received Response
22
Views
5
Comments

Hello

Our BI implementation uses Oracle R12 ERP as a souce and Oralce 10G as a data warehouse database. we use Informcatica and DAC to load data from RP to DW using OBIA modules.

Lower environments of our source ERP database are refreshed occasionally, and we are forced to truncate our Data warehouse and run a Full load of the all the data from Dev/QA Source environments. As the size of the production data increases, this Full load has started to take 2-3 weeks to complete and it takes down entire environment for a significant time and limits our development. Can we also refresh Data warehouse database along with the source ERP, so we can continue running incremental ETLs in lower environments? What all steps are involved in doing this? How do we sync Informatica (sequence Generators etc) and DAC?

Please help with identifying what issues we might run into and high level steps we need to take.

Thanks!

Answers

  • rmoff
    rmoff Rank 6 - Analytics Lead

    Have you tried profiling the execution to identify which steps are taking the longest to complete? From there you can start to identify the reasons. Without that, it could be anything really.

  • vkm
    vkm Rank 5 - Community Champion

    yes, I tried to idnetify and tune long running jobs, but the data has grown so large that it takes weeks to complete the load. secondly, since we never do a full load in prod, this does not help in any way and we have to add extra storage (mainly staging and informatica temp space) and processor to load this data in lower environments. It just seems quicker to refresh the DW to match with the source and continue development as usual without putting in effort to full reload the data.

  • rmoff
    rmoff Rank 6 - Analytics Lead
    I tried to idnetify and tune long running jobs,

    And what was the result? Did you identify the stages that were taking the longest to run?

    the data has grown so large that it takes weeks to complete the load. 

    Yes, you mentioned that already. I don't see the relation to profiling the load steps though.

    we have to add extra storage (mainly staging and informatica temp space) and processor 

    Do you know that these were causing the problem? Can you quantify the improvement that each made?

  • vkm
    vkm Rank 5 - Community Champion

    yes, adding hints to some long running jobs helped and adding additional resources helped to finish the jobs. Our major constraint is the resources allocated to the lower environments. the resourses are allocated based on incremental load, since 90% of the time we load only incremental data. so I'm exploring the possibilities of avoiding to load the complete data set since we do not do that in production environments ever.

  • Thomas Dodds
    Thomas Dodds Rank 8 - Analytics Strategist

    Why would your development warehouse be based on a moving target as the development source?  That source is up and down and you are at their mercy ... sources for EDW should be much more stable than the DEV ERP.