PeriodRolling over PeriodRolling? - rare aggregation requested

hi fellows,
I was asked to aggregate the data as shown in the screenshot:
cells of the 'current result' row are calculated as: (running sum of metric 1) - (running sum of metric 2)
cells of the 'shall be' row are calculated as: a running sum over the 'current result' row, i.e. running sum of ((running sum of metric 1) - (running sum of metric 2)) --- hope that makes sense : )
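The screenshot data itself is not in the thread, but the two rows can be sketched in Python. The metric values below are hypothetical; they are only chosen so that the running differences reproduce the '16' quoted later in the discussion.

```python
from itertools import accumulate

# Hypothetical observations (the real screenshot values are not available)
metric1 = [3, 1, 2, 9]
metric2 = [1, 1, 2, 1]

# 'current result' row: (running sum of metric 1) - (running sum of metric 2)
current_result = [a - b for a, b in zip(accumulate(metric1), accumulate(metric2))]

# 'shall be' row: a running sum over the 'current result' row
shall_be = list(accumulate(current_result))

print(current_result)  # [2, 2, 2, 10]
print(shall_be)        # [2, 4, 6, 16]
```

The 'shall be' row is a cumulative sum of a cumulative sum, which is exactly why a nested PeriodRolling looks like the natural (but disallowed) implementation.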
The question is: how to get such a result in OBIEE? An additional difficulty is that this has to be done at the RPD level. What you see in the screenshot is not an actual report, just data that lets you understand what kind of aggregation I want to implement. The actual report is a UNIONed report of many rows in a specific format, and, for example, I cannot add months to it (so I doubt RSUM can be used). Of the screenshot above, only the single cell with the value '16' would appear there.
I thought nesting PeriodRolling would be the solution; the problem is that this isn't allowed. I can create a logical column with a 'Derived from existing columns' formula, where both that formula and the formulas of the existing columns use the PeriodRolling function, but it then fails when used in a report, returning this error:
State: HY000. Code: 10058. [NQODBC] [SQL_STATE: HY000] [nQSError: 10058] A general error has occurred. (HY000)
State: HY000. Code: 43113. [nQSError: 43113] Message returned from OBIS. (HY000)
State: HY000. Code: 43119. [nQSError: 43119] Query Failed: (HY000)
State: HY000. Code: 42023. [nQSError: 42023] Illegal aggregate nesting in query. (HY000)
any help would be appreciated, thanks in advance
P.S.:
OBIEE version: 12.2.1.4.0 (Build BIPS-20200101165857 64-bit)
DB with data: Oracle 11.2.0.3.0
Answers
-
Karol, although I do not have a solution for you, I do have a different way for you (and whoever requested this calculation) to think about the problem.
First, here is your original example, broken out just a bit more to provide additional clarity.
And, for the sake of discussion, let's add a fifth observation, with '5' for metric 1, and '2' for metric 2.
The method of achieving the result of 16 in the highlighted cell in your original example, and the 29 in the additional example, is actually a progressive weighted sum of the differences between the raw numbers in metric1 and metric2.
Let's take a look at the first example, with four observations, and calculate the results as a progressive weighted sum.
When viewed this way, we see that the earlier observations are weighted more heavily than later observations. In the initial 4-observation example, the last observation of 8 (i.e. '9' for metric 1 minus '1' for metric 2) represents only 10% of the total weightings (1 out of 1+2+3+4), while the first observation represents 40% of the total weightings.
Now let's check the math with 5 observations. Using your original calculation method, we saw that the final value '29' would be the result if we add an additional (5th) observation of '5' for metric1 and '2' for metric2. Let's use the alternate progressive weighted sum calculation method and see if we get that same result.
Look at the weighting multiples (red numbers). Those multiples come straight from the algebra required to solve the problem. (I'm sure my 10th grade Algebra II teacher, Mr. Alspaugh, would be proud.) If these 5 observations represented days, with observation #5 occurring today and observation #1 occurring 4 days ago, then the 4-day-old observation #1 would be weighted 5x as heavily as the observation taken today.
And now let's go one step further, and imagine that there are not just four or five observations of metric1 and metric2, but there are instead 30,000 observations, representing 30,000 chronologically arranged time periods, of any grain, anything from 30,000 microseconds up to 30,000 centuries. The end product, that final number out at the end (the '16' in your four-observation example), will be a sum, not of four or five weighted numbers, but of 30,000 weighted numbers. By extending the logic of the two progressive weighted sum examples shown above, we know that the first observation on that chronological continuum must receive a weighting multiple of 30,000, while the last observation on that continuum will receive a weighting multiple of 1.
I must admit that I'm intrigued by a model that, if extended to a rather silly but mathematically sound conclusion, could say this: We have 30,000 observations. Those observations represent one value from each day of the past 82+ years, starting on May 20, 1938. And, based on the unalterable reality of the mathematics, the date of May 20, 1938 contributes 30,000 times as much to the final product as does today, July 8, 2020.
Is that a silly example? Of course it is. But that's the mathematical reality of the algebra underlying the requirement.
As I mentioned, I have no OBIEE RPD solution for the problem. I'll leave it to you to figure out how to generate the weighting multiples, and the rest will be quite simple, because at that point it will only require a single PeriodRolling function against the weighted component observations, i.e. PeriodRolling((metric 1 minus metric 2) * weighting factor).
Before starting down that path: Are we sure that whoever made this request understands the implications of the math behind it? Because frankly, I'm unable to imagine a scenario in which earlier observations should be weighted more heavily than later observations, and indeed that a daily observation taken on May 20, 1938 should contribute 30,000x as much to the final product as a daily observation taken on July 8, 2020.
But that's what the algebra says.
-
If you get rid of one of the PERIODROLLING calculations, you can just use the one PERIODROLLING you are left with, and it will work.
You can achieve the same result as PERIODROLLING by modelling the join not as fact.date = date_dim.date, but as a BETWEEN (a range condition).
That way the join returns a PERIODROLLING-style cumulative value by default, without calling PERIODROLLING, and you can then freely use any other time series function or aggregation on top.
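The effect of that range join can be sketched outside OBIEE. With an equality join, each calendar month sees only its own fact rows; with a range condition (here written as `<=`, one half of a BETWEEN), each month sees every fact row at or before it, so a plain SUM already yields the cumulative value. The fact rows below are hypothetical.

```python
# Hypothetical fact rows: (fact_month, value)
facts = [(1, 3), (2, 1), (3, 2), (4, 9)]
calendar = [1, 2, 3, 4]

# Equality join (fact_month = cal_month): plain monthly sums
monthly = {m: sum(v for fm, v in facts if fm == m) for m in calendar}

# Range join (fact_month <= cal_month): each month already carries
# the cumulative sum, with no PERIODROLLING call needed
cumulative = {m: sum(v for fm, v in facts if fm <= m) for m in calendar}

print(monthly)     # {1: 3, 2: 1, 3: 2, 4: 9}
print(cumulative)  # {1: 3, 2: 4, 3: 6, 4: 15}
```

Because the cumulative value comes from the join itself, the one remaining PERIODROLLING (or any other time series function) can then be layered on top without triggering the illegal-nesting error.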
-
@Mark.Thompson :
that's an interesting theory, let me think about it a bit more over the weekend, maybe over a beer or two : )
I'll explain the business case to you then, too.
-
@Gianni Ceresa, thanks a lot. I had forgotten about this old trick of playing with the join condition; it's been ages since I last used it. In the end I almost made it work, though not 100%. Let me sketch the dataset again and I'll explain where the issue is:
-
...so, here it is:
and here is the link to the Google Sheet itself:
https://docs.google.com/spreadsheets/d/1PQQjR5Zy4bqa6U0OVRXnndyhGQijWVsZnVmYw35zOSs/edit?usp=sharing
is there a way to make PeriodRolling actually stop rolling at the point in time when the metrics no longer have values?
My month filter will equal var_current_month, but in many cases (say, for many countries) the metrics will be 0 for the last couple of months, so the result gets heavily skewed if I can't disable the rolling for those months.
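The skew is easy to see in a sketch: with trailing empty months, the outer running sum keeps re-adding the last inner cumulative value, even though nothing new has happened. A hypothetical fix (treating a zero difference as "no data", which is the simplification implied by the post) caps the outer sum at the last month that still has an observation; no clean RPD-side equivalent was found in this thread.

```python
from itertools import accumulate

def nested_total_naive(diffs):
    """Running sum over a running sum: trailing zero months still inflate it."""
    return list(accumulate(accumulate(diffs)))[-1]

def nested_total_capped(diffs):
    """Stop the outer rolling at the last month with a non-zero observation."""
    last = max((i for i, d in enumerate(diffs) if d != 0), default=-1)
    inner = list(accumulate(diffs))
    return sum(inner[: last + 1])

# Hypothetical differences; the last two months have no data
diffs = [2, 0, 0, 8, 0, 0]

print(nested_total_naive(diffs))   # 36 -- skewed by the empty months
print(nested_total_capped(diffs))  # 16 -- rolling stopped at month 4
```

Note that a genuine observation of 0 is indistinguishable from "no data" in this sketch; a real fix would need a separate "has data" flag per month.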
-
In case anybody is interested whether a solution was found: no, it was not; in the end it had to be solved in the ETL.