1 2 Previous Next 24 Replies Latest reply: Mar 27, 2013 2:55 PM by DavidGreenfield RSS

    Cost-based aggregation


      Is it possible to find out how the cube is aggregated when you use cost-based aggregration?

      The cost-based aggregation is giving me reasonable load times, disk usage and query times. But I can't use this because one of my hierarchies changes rather often causing the complete cube to be re-aggregated. If I use level-based aggregation I can overcome this problem but I am having trouble finding the best configuration for which levels to aggregate on.

      Regards /Magnus
        • 1. Re: Cost-based aggregation
          Or would it be possible to exclude one level from an hierarchy from being pre-computed when using cost-based aggregation?

          Regards /Magnus
          • 2. Re: Cost-based aggregation

            I think you are asking about dynamically aggregating over a hierarchy (or some parts of a hierarchy, like a level or a member).
            AWM does not expose that kind of functionality, but its there in the OLAP.

            You can set the levels or even parent members for which the cube data is pre-aggregated. For all the other levels or parent-members, it will be dynamically aggregated. Its done through PRECOMPUTE.

            Here is some explanation. The example is about doing complete dynamic aggregation over a hierarchy. Then I mention other PRECOMPUTE conditions that you can use.

            Lets say you want the cube to be dynamically aggregated over a hierarchy at query time (instead of pre-aggregating over that hierarchy), you can set the PrecomputeCondition of the cube by selecting the dimension and setting PrecomputeCondition to NONE. If you describe the AGGMAP for this cube (in olap worksheet), you will then see PRECOMPUTE(NA) for that dimension. In case of uncompressed cubes, the AGGMAP may still show PRECOMPUTE(<valueset>), but that valueset will be empty.

            You can also query ALL_CUBES view to see the PRECOMPUTE settings. For more PRECOMPUTE options look at RELATION statement documentation in AGGMAP at http://docs.oracle.com/cd/E11882_01/olap.112/e17122/dml_commands_1006.htm#i1017474


            *<Metadata Version="1.3">*
            *<Cube Name="BNSGL_ACTV" Owner="BAWOLAP">*
            *<AWCubeOrganization PrecomputeCondition="BAWOLAP.PRODUCT NONE"/>*
            *</Metadata> !');*

            In addition to NONE, the other options for PRECOMPUTE are

            (1). ALL
            (2). AUTO
            (3). n%
            (4). levels of dimensions to be precomputed
            (5). a list of one or more parent members to be precomputed. For rest of the parent members, dynamic aggregation will be done at query time.
            (6). According to documentation, some conditional statements can be used also (although I have not tried it). For example:
            PRECOMPUTE (geography.levelrel ‘L3')
            PRECOMPUTE (LIMIT(product complement ‘TotalProd’))
            PRECOMPUTE (time NE ’2001')

            Note that there maybe a bug because of which the dimensions (over which the dynamic aggregation is desired) should be last dimensions in the aggregation order.

            For your situation, you should look at (4) or (5) or (6)

            • 3. Re: Cost-based aggregation
              Hi Nasar,

              Thank you for your reply!

              Is it possible to use the PrecomputeCondition together with cost-based aggregation (that would be the perfect solution for me)?

              If I look at your example - you only mention 1 dimension in the PrecomputeCondition clause. Will all other dimensions be cost-based aggregated then?

              Regards /Magnus
              • 4. Re: Cost-based aggregation
                The solution that I mentioned should work for all types of cubes - compressed or uncompressed (I think).

                Basically, you are just setting what to pre-aggregate in each dimension of the cube.

                You can do that same DBMS_CUBE.IMPORT_XML statement for each dimension where you want certain levels or members to be pre-aggregated and the rest dynamically aggregated.

                I do not understand what you mean by cost-based aggregation here.
                • 5. Re: Cost-based aggregation

                  By cost-based aggregation I mean the check box in the AWM ui. You can select between level based aggregation or cost-based aggregation (in the pre-compute tab).

                  Regards /Magnus
                  • 6. Re: Cost-based aggregation
                    Send me XML of your aw at nasar.ali-khan@oracle.com
                    with more details about your question.

                    I will post the answer here.
                    • 7. Re: Cost-based aggregation
                      Hi Nasar,

                      I think what Magnus wants to achieve is to use Precompute(on certain levels) but with Cost-based aggregation on many dimensions. I don't think it would be possible
                      I have been really wanting to do a level-based aggregation in OLAP while using the reporting advantage behind Cost-based-aggregation ( I.e Sparse Looping)

                      From previous Posts I think ORAcle recommends Cost-based-aggregation to level based, the reason being that Oracle OLAP applies Sparse-looping when looping over a composite to pull cube data from the cells. This feature comes into play
                      for cubes aggregated via cost-based-aggregation I think. You can correct me If I'm wrong.
                      My problem here is that Cost-based-aggregation isn't transparent to the developer, so yu just specify the % and let Oracle do it for you, which is good but sometimes we want to dig into the details for a better performance

                      Have you ( or anybody on the Forum) ever implemented a level-based aggregation strategy on any of your cubes?
                      if yes then any feedback on the cube performance during cube bulid/ Cube querying would be highly appreciated.
                      Plese also post some examples if you don't mind.

                      • 8. Re: Cost-based aggregation
                        Hi user9018701,

                        Yes - you are correct in your assumption. What you mention is what I am trying to achieve. The level-based aggregation gave me very poor query times and almost tripled the disk usage (even when I chose to pre-compute very few levels). That is why I am trying to use the cost-based aggregation but excluding certain levels.

                        Based on Nasars input I have been able to exclude one complete dimension from being pre-computed in a cost-based configuration. NOTE: You have to put the dimension you want to exclude at the end of the ConsistentSolve clause in the cube xml. This setup allows me to makes changes to that dimension without triggering a complete re-aggregation the cube - which is great.

                        The query times are ok but the ideal solution would be to just exclude certain levels from the pre-computation not a complete dimension.

                        Regards /Magnus
                        • 9. Re: Cost-based aggregation
                          Hi Nasar,

                          I have now re-built the cube with one dimension not being pre-computed at all and unfortunately the query times are not satisfying.

                          If you think that it is possible to do the same but only excluding certain levels it would be great.

                          Regards /Magnus
                          • 10. Re: Cost-based aggregation
                            Hi Magnus /Nasser/ David Greenfield,

                            I think the ultimate question to answer is the following ...

                            Is it possible to do a Cost-based-aggregation but only roll-up  certain dimension values?*

                            For example, lets assume the following scenario, My Sales Cube CubeA is dimensioned by dimension DimA (a hierarchical dimension as shown below) and my aggregation rule is SUM


                            Now I am implementing a cost-based-aggregation strategy (40% ) but the sales for dimension value AL1 should NOT be pre-aggregated.
                            I want the sales for dimension value AL1 to be calculated on-the-fly. Is this possible and if Yes how can it be done?
                            Please note that I am not interested in the Level-based technique

                            Thank you
                            • 11. Re: Cost-based aggregation
                              No, this is not possible. The only alteration to the cost based approach is to let one or more dimensions be entirely dynamic.

                              Could we please step back and look at the cube itself?

                              (1) How many dimensions do you have?

                              (2) How many members are there in the dimensions? (e.g. do any have more than 1 million members?)

                              (3) How have you partitioned the cube? (e.g. how many partitions are there?)

                              (4) How many leaf rows are loaded into a typical partition? You can get this information from AWM or from the OUTPUT column of the CUBE_BUILD_LOG table in your schema.

                              (5) How long does it take to do a complete build of your cube? What time would you consider reasonable?

                              (6) How long does it take to run your problematic queries? What time would you consider reasonable?
                              • 12. Re: Cost-based aggregation
                                Hi David,

                                Thank you for taking your time with this. I have found a workaround (it is not ideal - the ideal solution is mentioned in the previous post in this thread) in splitting the problematic dimension in two.

                                The dimension used to look like this:

                                A -> B -> C -> D -> E

                                Some members on the D level changes parent every few days causing a full re-aggregation of the cube. I have now split the dimension as this:

                                A -> B -> C

                                D -> E

                                This causes some problems when analyzing the data but it works.

                                Here are answers to your questions:

                                (1) How many dimensions do you have?

                                (2) How many members are there in the dimensions? (e.g. do any have more than 1 million members?)
                                From 20 up to 90000.

                                (3) How have you partitioned the cube? (e.g. how many partitions are there?)
                                Per day. Right now I have 100 partitions planning to grow the cube to 500 partitions.

                                (4) How many leaf rows are loaded into a typical partition? You can get this information from AWM or from the OUTPUT column of the CUBE_BUILD_LOG table in your schema.
                                6 000 000.

                                (5) How long does it take to do a complete build of your cube? What time would you consider reasonable?
                                Each partition takes approx 2 hours. That is reasonable. But a complete re-aggregation (when a dimension is changed) takes 90 mins per partition. Since I can run this in parallel on 6 cores a complete re-aggregation for 500 partition would take approx 5 days and that is not acceptable.

                                (6) How long does it take to run your problematic queries? What time would you consider reasonable?
                                When doing a level based aggregation the query time went up from a few seconds (for cost-based aggregation) to a few minutes.

                                Regards /Magnus
                                • 13. Re: Cost-based aggregation

                                  One quick observation.

                                  Number of partitions is too high.

                                  Can you reduce it, by partitioning at higher level?
                                  • 14. Re: Cost-based aggregation
                                    Magnus / User9018701,

                                    Here is summary of what David Greenfield explained to me, sometime ago.

                                    For Compressed Cubes, three types of aggregation are supported:

                                    *(#1) Percent Based*
                                    You set PrecomputePercent > 0 and PrecomputeCondition = null
                                    This is what is normally done.

                                    *(#2) ValueSet based*
                                    You set PrecomputePercent = 0,
                                    but then you have to specify the contents of the value sets for each dimension using PrecomputeCondition. In other words, you have to specify what levels or members you want to pre-aggregate for each and every dimension.
                                    This can get cumbersome as you have to specify each and every member or level for every dimension, which should be pre-aggregated.
                                    I have not tried it, so I don't know about cube loading performance and query performance.

                                    *(#3) Partial Percent Based*
                                    You set PrecomputePercent > 0
                                    and then set PrecomputeCondition of one or more dimensions to NONE.
                                    For example: PrecomputeCondition = "PROD NONE, CHANNEL NONE"

                                    This is what I mentioned earlier in this discussion.

                                    Note that for this #3 situation, you may need C patch if you are doing complete dynamic aggregation for more than one dimension.


                                    Unfortunately for Compressed-Cubes, it is not possible to set PreCompute %age > 0 and then use ValueSet based aggregation (i.e., select members or levels for one or two dimensions for pre-aggregation)

                                    The syntax for #2 is given below.
                                    NOTE: The syntax for the precompute condition is the same as the syntax for the FOR...BUILD construct in the DBMS_CUBE.BUILD script.
                                    If you need more help with the Precompute Syntax, then start another discussion thread.


                                    <Metadata Version="1.3">
                                    <Cube Name="BNSGL_BAL" Owner="BAWOLAP">
                                    PrecomputeCondition="BAWOLAP.DEPT WHERE HIER_LEVEL(BAWOLAP.DEPT.HIER_DEPT) = 'LV2_DEPT'"/>
                                                   .... (precompute conditions for all dimensions)
                                    </Metadata> !');
                                    1 2 Previous Next