apply all dataset features to subject area

Marcelo Finkielsztein · February 2022

Organization Name (Required - If you are an Oracle Partner, please provide the organization you are logging the idea on behalf of):

Postmedia Network

Description (Required):

Make all the features available for datasets also available for subject areas.

Use Case and Business Need (Required):

We use an RPD (data model) file to create the vast majority of our workbooks.

We have noticed several features that work with datasets but do not work when the workbook is based on a Subject Area, including but not limited to:

insights
explain attributes (right-click explain on an attribute)
format date-time attributes

We would like our users to adopt DV technology, but we need to ensure they will not miss functionality that they see as "standard" when they create their Classic analyses.

Thanks,

Marcelo

Gabby Rubin-Oracle · March 2022

ML operations usually require denormalized datasets in order to provide meaningful insights. In complex models with multiple join paths, the size of the queries required and the variation in join paths make such processing unrealistic. As an example, this would be similar to running 'Select * from Subject Area', which is not a query most people would consider running against their RPDs.
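To make the join-path point concrete, here is a minimal sketch that counts the possible join paths between two tables in a toy model. The table names and the `JOINS` graph are hypothetical and are not taken from any real RPD; the only purpose is to show that even a small model with two fact tables offers several ways to relate the same pair of dimensions.

```python
# Hypothetical toy model: tables are nodes, joins are undirected edges.
# Two fact tables (Sales, Returns) share three dimensions.
JOINS = {
    "Sales": ["Customer", "Product", "Date"],
    "Returns": ["Customer", "Product", "Date"],
    "Customer": ["Sales", "Returns"],
    "Product": ["Sales", "Returns"],
    "Date": ["Sales", "Returns"],
}


def join_paths(start, end, visited=()):
    """Return every join path from start to end that visits no table twice."""
    visited = visited + (start,)
    if start == end:
        return [visited]
    paths = []
    for neighbour in JOINS[start]:
        if neighbour not in visited:
            paths.extend(join_paths(neighbour, end, visited))
    return paths


paths = join_paths("Customer", "Product")
for path in paths:
    print(" -> ".join(path))
```

In this toy model there are already four distinct ways to relate Customer to Product, and each can return a different result set. A 'Select *' style query would have to pick one of them, or combine them all, for every pair of tables.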

There are ways to mitigate this, some better than others. As an example, we can assume that ML algorithms will run on single entities, but the results might not fit the user's needs and will likely provide false insights in some cases. We actually performed multiple experiments on ways to automate this and resolve the problem, and we might find automated solutions in the future. For now, the better approach is to ask the user to construct the denormalized selection of columns that we will use as a starting point (the system will do its automated feature selection on top of that). This minimizes the 'Select *' nature and targets the user's needs better.
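As a rough illustration of what a user-constructed denormalized selection means, the sketch below flattens two toy tables into one row per order and keeps only the columns the user picked. The data, the column names, and the `denormalize` helper are all hypothetical; this is not how Oracle Analytics builds the selection internally.

```python
# Hypothetical toy data; table and column names are illustrative only.
ORDERS = [
    {"order_id": 1, "customer_id": 10, "amount": 250.0},
    {"order_id": 2, "customer_id": 11, "amount": 90.0},
    {"order_id": 3, "customer_id": 10, "amount": 40.0},
]
CUSTOMERS = {
    10: {"segment": "Enterprise", "region": "West"},
    11: {"segment": "Consumer", "region": "East"},
}


def denormalize(selected_columns):
    """Join each order to its customer and keep only the selected columns."""
    rows = []
    for order in ORDERS:
        merged = {**order, **CUSTOMERS[order["customer_id"]]}
        rows.append({col: merged[col] for col in selected_columns})
    return rows


# The user, not the system, decides which columns define the scope.
flat = denormalize(["segment", "region", "amount"])
for row in flat:
    print(row)
```

The result is a single flat table with a known grain (one row per order), which is the kind of input a training or scoring step can work with. Any automated feature selection would then run on these columns only.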

At this point, the way Explain and Auto Insights work does not include a step for the user to define a selection on top of a data model, so the way we achieve it is by creating a Local Subject Area (LSA) dataset, which is basically the denormalized scope that the user requests. We are considering introducing something along these lines to the Workbook experience in the future, so that the LSA dataset is created temporarily as part of the user flow, but in any case, user intervention will be required.

There will be future ML features that will not have this limitation. As an example, explaining behaviors within a specific chart should be possible regardless of the data source since the chart query already provides a denormalized result set, but there might be other limitations that we are not yet aware of.

While I agree with your comment about migrating Classic customers to DV when it comes to required functionality, I'm not sure of its relevance to the ML features, as those do not exist in Classic. They are an added value, so they might prompt some users to use DV, and I understand that not having them for SAs reduces that added value. However, the reason is based on what is required to run an ML training and scoring process (or other statistical algorithms), not because we do not agree on the value.

Oracle Analytics Cloud and Server Idea Lab


Submitted · Last Updated March 2023
