1 2 Previous Next 16 Replies Latest reply: Apr 5, 2013 2:21 AM by 892844 RSS

    Cube Refresh Performance Issue

    892844
      We are facing a strange performance issue related to cube refresh. The cube which used to take 1 hr to refresh is taking around 3.5 to 4 hr without any change in the environment. Also, the data that it processes is almost the same before and now. Only these cube out of all the other cubes in the workspace is suffering the performance issue over a period of time.

      Details of the cube:
      This cube has 7 dimensions and 11 measures (mix of sum and avg as aggregation algorithm). No compression. Cube is partioned (48 partitions). Main source of the data is a materialized view which is partitioned in the same way as the cube.

      Data Volume: 2480261 records in the source to be processed daily (almost evenly distributed across the partition)

      Cube is refreshed with the below script

      DBMS_CUBE.BUILD(<<cube_name>>,'SS',true,5,false,true,false);

      Has anyone faced similar issue? Please can advise on what might be the cause for the performance degradation.

      Environment - Oracle Database 11g Enterprise Edition Release 11.2.0.3.0
      AWM - awm11.2.0.2.0A
        • 1. Re: Cube Refresh Performance Issue
          Nasar-Oracle
          Cube loading is a two step process:
          (1). Loading relational data into the cube (i.e., stored measures)
          (2). Aggregating that data in the cube

          You have to find out which step is taking more time now.

          Assuming you are using 11.2 version of db, take a look at the CUBE_BUILD_LOG table and compare the records for an earlier BUILD_ID vs. a recent one.

          CUBE_BUILD_LOG contains very useful information.

          .
          • 2. Re: Cube Refresh Performance Issue
            Nasar-Oracle
            One more thing.

            I noticed that you are doing AVG aggregation.

            Can you explain why is it being done, and for which dimension?

            The reason I am asking this is because SUM aggregation is quicker and in one case I was able to achieve Average aggregation of financial ADB (Average-Daily-Balance) data over TIME dimension hierarchy by only using SUM aggregation for TIME.

            That is why I want to know if that technique can help you or not.
            • 3. Re: Cube Refresh Performance Issue
              892844
              AVG is done for a few measures across all the dimensions. Can you pls elaborate on how that avg aggregation is achieved by just using SUM?
              • 4. Re: Cube Refresh Performance Issue
                Nasar-Oracle
                ADB data in the source fact table was at MONTH level. It was mapped to the ADB cube in olap.

                For example:
                FACT1 column mapped to MEAS1
                FACT2 column mapped to MEAS2
                etc.

                Required aggregation over time dimension was AVERAGE.
                Example:
                Quarter = ((Period1 x its time_span) + (Period2 x its time_span) + (Period3 x its time_span)) / (Quarter time_span)

                Q1 = ( (Jan ADB data) x 31 + (Feb ADB data) x 28 + (Mar ADB data) x 31) / 90

                To achieve AVG time aggregation, the fact columns in the ADB cube source view were multiplied with timespan and then mapped to olap stored measures. I had to join the FACT table with TIME dimension source sql table, as timespan column was in the TIME table and not in fact table.

                FACT1 x time_span  mapped to MEAS1
                FACT2 x time_span  mapped to MEAS2 etc.

                TIME aggregation for the cube was then set to SUM in olap.

                Now here is the trick.+

                A calculated measure was created corresponding to each stored measure.

                CALC_MEAS1 = CUBE.MEAS1 / "TIME".TIME_SPAN
                CALC_MEAS2 = CUBE.MEAS2 / "TIME".TIME_SPAN
                etc.

                This will show correct data at MONTH, QUARTER and YEAR levels.

                So it was simple and fast.
                • 5. Re: Cube Refresh Performance Issue
                  Nasar-Oracle
                  Forgot to mention that YTD calculation for the above ADB data required the sum of timespan.

                  OLAP Expression Syntax is shown below.


                  YTD_MEAS1 =
                  SUM(CUBE.MEAS1) OVER (HIERARCHY "TIME".TIMEHIER BETWEEN UNBOUNDED PRECEDING AND CURRENT MEMBER WITHIN ANCESTOR AT DIMENSION LEVEL "TIME".FISCAL_YEAR) / SUM("TIME".TIME_SPAN) OVER HIERARCHY ("TIME".TIMEHIER BETWEEN UNBOUNDED PRECEDING AND CURRENT MEMBER WITHIN ANCESTOR AT DIMENSION LEVEL "TIME".FISCAL_YEAR)

                  YTD_MEAS2 =
                  SUM(CUBE.MEAS2) OVER (HIERARCHY "TIME".TIMEHIER BETWEEN UNBOUNDED PRECEDING AND CURRENT MEMBER WITHIN ANCESTOR AT DIMENSION LEVEL "TIME".FISCAL_YEAR) / SUM("TIME".TIME_SPAN) OVER HIERARCHY ("TIME".TIMEHIER BETWEEN UNBOUNDED PRECEDING AND CURRENT MEMBER WITHIN ANCESTOR AT DIMENSION LEVEL "TIME".FISCAL_YEAR)



                  .
                  • 6. Re: Cube Refresh Performance Issue
                    892844
                    Interesting! Nice to know abt this!

                    In our case, the scenario is bit different as we have got the measure to be averaged across all the dimensions.

                    Another thing to be checked with you is, how the log entries get into CUBE_BUILD_LOG table. What we see when we refresh the cube through AWM or through the script directly in a database session it logs, however when we refresh by invoking a shell script (which includes the same cube build script) we dont see the log entries. Is there anything needs to be turned on to get all the log entries ..any thoughts??
                    • 7. Re: Cube Refresh Performance Issue
                      DavidGreenfield
                      Records are logged to CUBE_BUILD_LOG whether you build from DBMS_CUBE.BUILD or from AWM. (The situation for CUBE_REJECTED_RECORDS is different.)

                      Perhaps you are building from another schema? Records are sent to the CUBE_BUILD_LOG table in the invoker's schema, not the AW's schema. As an example, if you have an AW in the SCOTT schema and you execute the DBMS_CUBE.BUILD call logged in as SCOTT, then all log records will be inserted into SCOTT.CUBE_BUILD_LOG. But if you execute DBMS_CUBE.BUILD logged in as some master schema, OLAPDBA say, then the records would be inserted into OLAPDBA.CUBE_BUILD_LOG. In either case if the table doesn't exist, then the build will continue without error (and without logging).

                      On the general build slowdown, have you tried clearing the cube and rebuilding it from scratch? If you have been running incremental loads for a while, then cube build performance can slow down. Think about disk-fragmentation and you get the basic idea.
                      • 8. Re: Cube Refresh Performance Issue
                        892844
                        That's a perfect guess, David!!! Thanks for spotting that..we are running it from a different schema which does not have cube_build_log table.

                        Haven't tried clearing the cube and refreshing it from scratch as it is a production system, and is this something that we have to do on a regular basis?
                        • 9. Re: Cube Refresh Performance Issue
                          DavidGreenfield
                          Do you have a test environment where you can run a complete build on the same data to see how long it takes?

                          There are no hard and fast rules as to when to run a "complete build" (i.e. clear then load all records and aggregate from scratch) instead of an "incremental build". I certainly can't say something like "run a complete build once a month" since it depends on too many factors -- how often you update, initial cube size, how many records are added each build, number of dimensions, structure of hierarchies, types of modification, hardware, disk performance, etc. But as a common sense rule I would say you should run a complete build at the point where the incremental build becomes more expensive than a complete build would be. This requires trial and error.
                          • 10. Re: Cube Refresh Performance Issue
                            892844
                            Thanks for your assistance, David

                            Actually, test environment is not the exact replica in terms of volume. I think I would need to create a maintenance script through AWM in production to do the same? Is there any handy command that I can issue directly in the database?
                            • 11. Re: Cube Refresh Performance Issue
                              DavidGreenfield
                              To run a complete (i.e. clear and then reload from the fact table) build of cube MY_CUBE you can do this:
                              exec DBMS_CUBE.BUILD('MY_CUBE USING (CLEAR, LOAD, SOLVE)', add_dimensions=>false, parallelism=>5)
                              Obviously you can change the parallelism as you like. The key is to add the CLEAR step.
                              • 12. Re: Cube Refresh Performance Issue
                                Nasar-Oracle
                                Take a look at DBMS_CUBE.BUILD documentation at http://download.oracle.com/docs/cd/E11882_01/appdev.112/e16760/d_cube.htm#ARPLS218 and DBMS_CUBE_LOG documentation at http://download.oracle.com/docs/cd/E11882_01/appdev.112/e16760/d_cube_log.htm#ARPLS72789

                                You can also search this forum for more questions/examples about DBMS_CUBE.BUILD


                                David Greenfield has covered many Cube loading topics in the past on this forum.

                                Mapping to Relational tables

                                Re: Enabling materialized view for fast refresh method

                                DBMS CUBE BUILD

                                CUBE_DFLT_PARTITION_LEVEL in 11g?

                                Reclaiming space in OLAP 11.1.0.7

                                Re: During a cube build how do I use an IN list for dimension hierarchy?


                                .
                                • 13. Re: Cube Refresh Performance Issue
                                  892844
                                  Thanks Nasar and David. Have cleared the cube now, will be running the full build and check the timings.

                                  Thanks guys, will keep you posted!!
                                  • 14. Re: Cube Refresh Performance Issue
                                    892844
                                    Guys - The refresh time has come down to 2 hrs now. Thanks very much. Actually we are facing another performance issue regarding CPU utilization during cube refresh for whcih I have posted a new thread. If you have any ideas pls share.

                                    Thanks again, guys!
                                    1 2 Previous Next