14 Replies Latest reply: Apr 21, 2012 12:57 PM by robvarga

    multiple layer of group by

    Johnny_hunter
      hello all:

      I learned how to use the Coherence aggregator functions; here is an example:
      GroupAggregator aggregator = GroupAggregator.createInstance(prop, new BigDecimalAverage(target));
      Map<Object, Object> ret = aggregate(filter, aggregator);
      It works fine, but what is the best practice if I want to group by more than one property? Say I want to group by "date" and "currency" to figure out the average of "cash": what is the proper data structure to hold the result?

      Thanks,
      John
        • 1. Re: multiple layer of group by
          robvarga
          Hi John,

          You can use an extractor which extracts the multiple attributes into a class in which equals is implemented by checking all the properties. I believe MultiExtractor does this, so you should be able to use
          GroupAggregator.createInstance(new MultiExtractor("getAttribute1,getAttribute2"), new BigDecimalAverage(target))
          to create your aggregator.
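
For intuition about what a composite group key does, here is a minimal plain-Java, in-memory sketch. It is not Coherence API: the `Position` class, its fields, and the sample data are made up for illustration. The point is only that a `List` of the extracted attributes works as a group key because `List.equals` compares all elements.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical value class; the field names are assumptions for this sketch.
final class Position {
    final String date;
    final String currency;
    final BigDecimal cash;

    Position(String date, String currency, BigDecimal cash) {
        this.date = date;
        this.currency = currency;
        this.cash = cash;
    }
}

final class CompositeKeyGrouping {
    // Groups by (date, currency) and averages cash per group, locally and in memory.
    static Map<List<Object>, BigDecimal> averageCash(List<Position> positions) {
        Map<List<Object>, BigDecimal> sums = new HashMap<>();
        Map<List<Object>, Integer> counts = new HashMap<>();
        for (Position p : positions) {
            // The key order mirrors the order in which the attributes are listed.
            List<Object> key = Arrays.<Object>asList(p.date, p.currency);
            sums.merge(key, p.cash, BigDecimal::add);
            counts.merge(key, 1, Integer::sum);
        }
        Map<List<Object>, BigDecimal> averages = new HashMap<>();
        for (Map.Entry<List<Object>, BigDecimal> e : sums.entrySet()) {
            long n = counts.get(e.getKey());
            averages.put(e.getKey(), e.getValue().divide(BigDecimal.valueOf(n), 2, RoundingMode.HALF_EVEN));
        }
        return averages;
    }

    public static void main(String[] args) {
        List<Position> positions = Arrays.asList(
            new Position("2012-04-18", "USD", new BigDecimal("10.00")),
            new Position("2012-04-18", "USD", new BigDecimal("20.00")),
            new Position("2012-04-18", "EUR", new BigDecimal("5.00")));
        System.out.println(averageCash(positions));
    }
}
```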

          Best regards,

          Robert
          • 2. Re: multiple layer of group by
            Johnny_hunter
            thanks Robert:

            However, in this case, what is the return type of "aggregate"? Is it still Map<Object, Object>?

            John
            • 3. Re: multiple layer of group by
              robvarga
              Hi John,

              Yes, and no.

              For GroupAggregator, the Map<Object, Object> is conceptually a Map<GroupKey, GroupAggregateFunctionOutput>, where GroupKey is whatever the group key extractor extracts.

              MultiExtractor extracts List<Object> instances, so you can actually cast the aggregation result entry keys to List<Object>. The order of the extracted values in that list corresponds to the order of the original value extractors or getter method names specified to the MultiExtractor constructor.

              The value objects will still be the aggregate function outputs, each computed over the group of entries whose extracted attribute list equals the key.
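
A minimal sketch of reading such a result, assuming the group key was extracted as a two-element list in the order (date, currency). The map in `main` is a hand-built stand-in with made-up data, not the output of a real cache call, and `GroupResultReader` is a hypothetical helper name. The code relies only on the result being a `Map` and on the keys being `List` instances.

```java
import java.math.BigDecimal;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class GroupResultReader {
    // Formats each group as "date/currency -> value", sorted for stable output.
    static List<String> describe(Object aggregateResult) {
        // Cast to the Map interface only; the concrete class is not assumed.
        Map<?, ?> groups = (Map<?, ?>) aggregateResult;
        List<String> lines = new ArrayList<>();
        for (Map.Entry<?, ?> entry : groups.entrySet()) {
            // Each key is the list of extracted attributes, in extractor order.
            List<?> key = (List<?>) entry.getKey();
            Object date = key.get(0);
            Object currency = key.get(1);
            lines.add(date + "/" + currency + " -> " + entry.getValue());
        }
        Collections.sort(lines);
        return lines;
    }

    public static void main(String[] args) {
        // Stand-in for the object returned by aggregate(...); the data is made up.
        Map<Object, Object> standIn = new HashMap<>();
        standIn.put(Arrays.asList("2012-04-18", "USD"), new BigDecimal("15.00"));
        standIn.put(Arrays.asList("2012-04-18", "EUR"), new BigDecimal("5.00"));
        for (String line : describe(standIn)) {
            System.out.println(line);
        }
    }
}
```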

              Best regards,

              Robert
              • 4. Re: multiple layer of group by
                Johnny_hunter
                thanks Robert:

                I tried but couldn't get it to work:

                GroupAggregator aggregator = GroupAggregator.createInstance(new MultiExtractor(str), new BigDecimalAverage(target));
                Map<Object, Object> result = getNamedCache().aggregate(filter, aggregator); //line 2

                Line 2 never seems to be executed. Is there sample code showing how to use MultiExtractor correctly?

                Thanks,
                John
                • 5. Re: multiple layer of group by
                  User738616-Oracle
                  Hi John,

                  Assuming your value object has the attributes salary, age, and state, you can use a GroupAggregator to find the average salary grouped by state and age as below:

                  GroupAggregator aggregator = GroupAggregator.createInstance(new MultiExtractor(new ValueExtractor[]{new ReflectionExtractor("getState"), new ReflectionExtractor("getAge")}), new BigDecimalAverage("getSalary"));
                  Object result = cache.aggregate(AlwaysFilter.INSTANCE, aggregator);

                  Hope this helps!

                  Cheers,
                  NJ
                  • 6. Re: multiple layer of group by
                    Johnny_hunter
                    Thanks to all. Actually, both ways worked:

                    GroupAggregator aggregator = GroupAggregator.createInstance(new MultiExtractor(str), new BigDecimalAverage(target));
                    result = getNamedCache().aggregate(filter, aggregator); //line 2

                    GroupAggregator aggregator = GroupAggregator.createInstance(new MultiExtractor(new ValueExtractor[]{new ReflectionExtractor("getState"), new ReflectionExtractor("getAge")}), new BigDecimalAverage("getSalary"));
                    Object result = cache.aggregate(AlwaysFilter.INSTANCE, aggregator);

                    The result should be of type LiteMap.

                    Thanks again,
                    John
                    • 7. Re: multiple layer of group by
                      robvarga
                      Hi John,

                      The result may not be a LiteMap; it can be whatever Map implementation Coherence decides to instantiate, and that can change from version to version. All you can rely on is that it is going to be a Map.

                      Best regards,

                      Robert
                      • 8. Re: multiple layer of group by
                        Johnny_hunter
                        hi Robert:

                        Thanks for the follow-up advice!

                        John
                        • 9. Re: multiple layer of group by
                          Johnny_hunter
                          hello all:

                          Now I applied the above to BigDecimalSum and the same problem occurred. Why is that?
                          GroupAggregator aggregator = GroupAggregator.createInstance(new MultiExtractor(new ValueExtractor[]{new ReflectionExtractor("getProp1"), new ReflectionExtractor("getProp2")}), new BigDecimalSum("getProp3"));
                          
                          Object result = aggregate(filter, aggregator);
                          logger.debug("the_result_is="+result); //this line is not printed
                          thanks,
                          John
                          • 10. Re: multiple layer of group by
                            robvarga
                            Hi John,

                            There can be a few reasons why that line is not printed:

                            1. Your aggregate method throws an exception (I don't know what that method does).
                            2. The toString() of your result object throws an exception.
                            3. Your configuration disables debug level logging for that logger.
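
The three possibilities above can be told apart with a small check like the following sketch. It uses `java.util.logging` only to stay within the JDK (the logging framework in the original code is unknown), and `SilentLineDiagnosis` and `diagnose` are hypothetical names; the `Supplier` stands in for the call to the aggregate method.

```java
import java.util.function.Supplier;
import java.util.logging.Level;
import java.util.logging.Logger;

final class SilentLineDiagnosis {
    // Returns a short description of which of the three causes applies, if any.
    static String diagnose(Supplier<Object> call, Logger logger, Level level) {
        Object result;
        try {
            // Cause 1: the call itself throws.
            result = call.get();
        } catch (RuntimeException e) {
            return "call threw " + e.getClass().getSimpleName() + ": " + e.getMessage();
        }
        try {
            // Cause 2: building the log message fails inside toString().
            String.valueOf(result);
        } catch (RuntimeException e) {
            return "toString() threw " + e.getClass().getSimpleName();
        }
        // Cause 3: the logger is configured to drop messages at this level.
        if (!logger.isLoggable(level)) {
            return "logger does not log at level " + level;
        }
        return "line would be printed";
    }

    public static void main(String[] args) {
        Logger logger = Logger.getAnonymousLogger();
        logger.setLevel(Level.INFO);
        System.out.println(diagnose(() -> { throw new IllegalStateException("boom"); }, logger, Level.FINE));
        System.out.println(diagnose(() -> "ok", logger, Level.FINE));
        System.out.println(diagnose(() -> "ok", logger, Level.INFO));
    }
}
```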

                            Best regards,

                            Robert
                            • 11. Re: multiple layer of group by
                              Johnny_hunter
                              Robert: Thanks for the response. You are right, my aggregate method threw an exception showing that I was wrong to use BigDecimalSum (or another BigDecimal-prefixed aggregator):
                              java.io.IOException: decimal value exceeds IEEE754r 128-bit range
                              I was wondering why this exception stayed silent until I put a try/catch block around the suspicious code. Is it a runtime exception or a checked exception?

                              Thanks,
                              John
                              • 12. Re: multiple layer of group by
                                robvarga
                                Hi John,

                                It is a checked exception, but exceptions from serialization are wrapped into a RuntimeException because the NamedCache methods do not declare checked exceptions, as any exception occurring during those methods indicates an abnormal condition.

                                In this case, you were trying to use a BigDecimal which could not be represented in 128 bits. The problem is that you cannot currently substitute your own serializers for BigDecimal and BigInteger because they are handled as special cases, so you cannot provide a serializer that is free of the restrictions of the POF T_DECIMAL128 and T_INT128 types, which POF uses for BigDecimal and BigInteger.
                                I filed an improvement request for it a while ago, which got the JIRA id COH-5308. When implemented, it would allow you to use your own PofSerializers for BigDecimal and BigInteger, but I believe it is not implemented/released yet and I believe it is not scheduled for 3.7.x.
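
To get a feel for how quickly a BigDecimal can outgrow a 128-bit decimal, here is a plain-JDK sketch. It uses the fact that IEEE 754 decimal128 carries at most 34 significant decimal digits, which is what `MathContext.DECIMAL128` encodes. Whether the POF range check corresponds exactly to this digit budget is an assumption, the exponent range is ignored, and rounding your own values this way does not change what the built-in aggregators compute internally. `Decimal128Digits` is a hypothetical helper name.

```java
import java.math.BigDecimal;
import java.math.MathContext;
import java.math.RoundingMode;

final class Decimal128Digits {
    // Precision of the JDK's decimal128 math context (34 significant digits).
    static final int DECIMAL128_DIGITS = MathContext.DECIMAL128.getPrecision();

    // Checks only the number of significant digits, not the exponent range.
    static boolean fitsDecimal128Digits(BigDecimal value) {
        return value.precision() <= DECIMAL128_DIGITS;
    }

    public static void main(String[] args) {
        // A division carried out to 50 decimal places yields 50 significant digits.
        BigDecimal third = BigDecimal.ONE.divide(new BigDecimal(3), 50, RoundingMode.HALF_EVEN);
        System.out.println(third.precision() + " digits, fits: " + fitsDecimal128Digits(third));

        // Rounding to the decimal128 context brings it back within the digit budget.
        BigDecimal rounded = third.round(MathContext.DECIMAL128);
        System.out.println(rounded.precision() + " digits, fits: " + fitsDecimal128Digits(rounded));
    }
}
```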

                                As your test shows, it also hinders other out-of-the-box functionality that cannot be worked around (the out-of-the-box BigDecimal aggregators), which makes it a bit more serious than just not being able to transmit user-created BigDecimals. If you file a service request to release the fix for COH-5308 in a patch to 3.7, you may be able to get it. But don't take that as a promise; it is just my opinion that an actual use case may raise its priority.

                                It is possible to override type resolution for objects, so you could plug in your own ConfigurablePofContext subclass, which handles these two classes as a special case before falling back to the super implementation. It would be a bit less performant than if Coherence handled it on its own without the additional method delegation.

                                Best regards,

                                Robert

                                Edited by: robvarga on Apr 18, 2012 4:19 PM

                                Edited by: robvarga on Apr 18, 2012 4:23 PM
                                • 13. Re: multiple layer of group by
                                  Johnny_hunter
                                  Robert: thanks for the very educational comments about the limitations of the Decimal aggregators. My application, however, doesn't really need decimals. I used the Double-* aggregators instead, and they work fine.

                                  I am more interested in the exception. According to you, it is transformed into a runtime exception because NamedCache doesn't declare checked exceptions. So is a runtime exception just "quietly" swallowed by the JVM when it takes place? Sorry, this is more of a Java question than a Coherence one.

                                  Thanks,
                                  John
                                  • 14. Re: multiple layer of group by
                                    robvarga
                                    No, a runtime (unchecked) exception is not swallowed by the JVM. It is thrown and you can catch it, but the compiler does not force you to either catch it or declare it as thrown.

                                    If you don't catch an unchecked exception, it propagates up the call stack until either a catch block catches it or it is thrown out of the run() method of the current thread, which is the outermost frame of the call stack. If it is indeed thrown out, that thread dies and the uncaught exception handler gets it (if you don't configure one, the default implementation just prints the exception's stack trace to the standard error stream).
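
The propagation described above can be observed with plain JDK threads. In this minimal sketch, `UncaughtDemo` and `runAndCapture` are hypothetical names: a task throws an unchecked exception that nothing catches, it escapes run(), and the handler installed on the thread receives it.

```java
import java.util.concurrent.atomic.AtomicReference;

final class UncaughtDemo {
    // Runs the task on a new thread and returns whatever escaped run(), or null.
    static Throwable runAndCapture(Runnable task) {
        AtomicReference<Throwable> escaped = new AtomicReference<>();
        Thread worker = new Thread(task, "worker");
        // Replaces the default behaviour (printing the stack trace to System.err).
        worker.setUncaughtExceptionHandler((thread, error) -> escaped.set(error));
        worker.start();
        try {
            // Wait for the worker to die before reading what the handler stored.
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return escaped.get();
    }

    public static void main(String[] args) {
        Throwable t = runAndCapture(() -> {
            throw new IllegalStateException("nobody caught me");
        });
        System.out.println("handler received: " + t);
    }
}
```

If no handler is set on the thread and no default handler is configured, the same exception is reported on the standard error stream instead, which is easy to miss when only the application log is being watched.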

                                    Coherence does not declare checked exceptions because any exception arriving from NamedCache methods represents an abnormal situation (serialization errors, out of memory, user code throwing a RuntimeException, sent code interrupted because of timeouts). Declaring a checked exception for these would be unreasonable and would make development quite cumbersome. If user code throws checked exceptions, Coherence catches them and wraps them into a RuntimeException, but it does propagate them.
                                    Things like nodes coming and going are not abnormal scenarios, and Coherence does not throw exceptions in that case; it is designed to recover the operation on its own when the member set changes.
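
Because the original checked exception travels as the cause of the RuntimeException, it can be recovered by walking the cause chain. A minimal sketch follows; `failingCall` is a hypothetical stand-in that only mimics the wrapping described above (it is not a Coherence call, and the concrete wrapper class Coherence uses is not assumed here).

```java
import java.io.IOException;

final class RootCause {
    // Walks the cause chain to the innermost Throwable.
    static Throwable rootCause(Throwable error) {
        Throwable current = error;
        while (current.getCause() != null) {
            current = current.getCause();
        }
        return current;
    }

    // Stand-in for a cache call: a checked IOException wrapped in a RuntimeException.
    static Object failingCall() {
        throw new RuntimeException(
            new IOException("decimal value exceeds IEEE754r 128-bit range"));
    }

    public static void main(String[] args) {
        try {
            failingCall();
        } catch (RuntimeException e) {
            // The compiler did not force this catch; without it the exception
            // would simply keep propagating up the call stack.
            Throwable root = rootCause(e);
            System.out.println(root.getClass().getName() + ": " + root.getMessage());
        }
    }
}
```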

                                    Best regards,

                                    Robert

