Oracle Analytics Cloud and Server

Welcome to the Oracle Analytics Community: Please complete your User Profile and upload your Profile Picture

Trendline function returning inconsistent results depending on time columns in criteria

Received Response
11
Views
12
Comments
Mark.Thompson
Mark.Thompson Rank 6 - Analytics Lead

I created this analysis in Sample Sales:

2017-02-28_4-10-52.png

The formula for the Trend column is: TRENDLINE("Base Facts"."Revenue", ("Time"."Per Name Month"), 'LINEAR', 'VALUE')

It returns these results, with a gently sloping Trend line.

2017-02-28_5-33-12.png

For some of the other views in this analysis, I also need the Per Name Year column, so I added it to the criteria.  Now let's look at the results:

2017-02-28_5-36-12.png

The values in the Trend column, and of course its corresponding green trend line in the graph, are no longer the same as they were a moment ago, before adding the Year column.

But why?  Why does the formula return one set of results if Year is not in the criteria (picture 1), and a different set of results if Year is in the criteria (picture 2)?  The number of rows doesn't change, the revenue column doesn't change, so why should the results of the TRENDLINE function change? 

I can "fix" the formula by adding the Year component, and now I see the original values again:

TRENDLINE("Base Facts"."Revenue", ("Time"."Per Name Year","Time"."Per Name Month"), 'LINEAR', 'VALUE')

But again, why?  Why is that necessary, and why/how did it change the results?  My GUESS is that there is some type of smoothing technique being applied when the higher time level is included in the formula, but I can't find that in any documentation.

The documentation for the Trendline analytic function refers to this syntax:

TRENDLINE(<numeric_expr>, ([<series>]) BY ([<partitionBy>]), <model_type>, <result_type>, [<number_of_degrees>])

Looking for clues, I found a reference to the series parameter in this Oracle documentation (https://docs.oracle.com/middleware/bidv1221/desktop/BIDVD/BIDVD.pdf ) for the Data Visualization Desktop that states: "The numeric_expr represents the Y value for the trend and the series (time columns) represent the X value."   It presents this formula format: 

TRENDLINE(revenue, <em><strong>(calendar_year, calendar_quarter, calendar_month)</strong></em> BY (product), 'LINEAR', 'VALUE')

That syntax matches with my second formula, where I added Year to the series parameter.  But the documentation doesn't go far enough in explaining that series parameter, because it doesn't explain the very significant difference in results when more than one time level column is included in the analysis, nor does it explain why a reference to the additional time level column in the formula produces different results.

Can someone help me understand exactly what is happening here? 

«1

Answers

  • FTisiot
    FTisiot Rank 6 - Analytics Lead

    Hi,

    What you are looking is

    TRENDLINE(revenue, calendar_month BY calendar_year , 'LINEAR', 'VALUE')

    Anything that you include in your view that is not expressed in the series, it automatically gets assigned to the Grouping (by)

  • Mark.Thompson
    Mark.Thompson Rank 6 - Analytics Lead

    FTisiot, thank you for the reply.  Unfortunately, the "what's going on in the calculation" is still unclear to me.  "It automatically gets assigned to the Grouping (by)" isn't resonating with me so that I can say "AH-HA, now I get it!"  What does the 'Grouping (by)' really represent?  Are we grouping the months together by year, then comparing year-over-year results?  That doesn't sound logical, because if that was the case why would we need the monthly numbers at all?  Or is the TRENDLINE function somehow comparing similar periods in each year (i.e. month 1 to month 1, month 2 to month 2, etc...), and plotting a trend based on those monthly variances?   

  • [Deleted User]
    [Deleted User] Rank 2 - Community Beginner

    Just from a logical point of view:

    If you look at your trend over all of time - so just a progression of months without looking at the year level - then the green line on top is fine.

    If you look at the trend over the months in a year - so the second graph - then the green line is also fine.

    Your second chart simply takes the year into account in order to do the trend soit looks at it on a year-by-year basis.

  • FTisiot
    FTisiot Rank 6 - Analytics Lead

    Just to add to Christian's comment

    Every column that is not explicitly specified in the time series is creating a distinct subset of the data.

    If you substitute Year with Product then you'll end showing a different trend line for each product.

    That's the meaning of the group by.

    In you case is more difficult to understand since Year belongs to the same hierarchy of Month, so the lines "seem" continuos.

  • [Deleted User]
    [Deleted User] Rank 2 - Community Beginner

    Thanks @FTisiot I seem to be having issues in forming human-understandable sentences this morning

  • Mark.Thompson
    Mark.Thompson Rank 6 - Analytics Lead

    BINGO!!!

    Those are the exact words that make it clear, FTisiot.  Take Year out of the mix, substitute Product instead - show me a trend for each Product by Month.  As Christian noted, that is the spot-on way to think about it.  The inclusion of the Year column does indeed muddy the waters because it is the same "conceptual idea" as month.- the idea of the passage of time.

    Thank you!

  • [Deleted User]
    [Deleted User] Rank 2 - Community Beginner

    Human words hard. Sentence construct more hard.

    ....back to hacking commands

  • Mark.Thompson
    Mark.Thompson Rank 6 - Analytics Lead

    That reminded me of this: Way back in the 1970's when he was just becoming famous, Steve Martin had a gag in which he noted that he was a successful comedian because "some people have a way with words, while others ... oh ... not have way". 

    Your answers are usually right on target, Christian.  Carry on, sir!

  • [Deleted User]
    [Deleted User] Rank 2 - Community Beginner

    Hah thanks for the encouragement! It hurts though to get beaten in speaking english by an italian (hi @FTisiot of course I still love you, mate)

  • FTisiot
    FTisiot Rank 6 - Analytics Lead

    No worries @Christian Berg

    keep on trying...

    1kmkro.jpg