Incremental load through dataflows in Oracle Analytics DV 7.6

We currently receive incident extracts from ServiceNow via email in xlsx format for all incidents in our departments and manually upload these extracts into Oracle Analytics datasets. These datasets are then used to visually represent some key department metrics in an Oracle Analytics DV 7.6 workbook. We are now planning to automate the data ingestion process by using the REST API connection available in Oracle Analytics DV 7.6 to load data directly from ServiceNow into an Oracle Analytics DV dataset.
However, we are facing a challenge because the volume of data is too large to be ingested through a single API call. Our proposed solution is to perform a one-time historical load for inactive incidents and then only load incremental data for new incidents or any updates to existing incidents going forward.
While we can easily identify new rows of data (i.e., new incidents) through our dataflows, the problem lies in tracking and loading the changed data since the last historical load. Specifically, we need to determine how to identify and capture modifications made to incidents that were already ingested before the historical load, to ensure our dataset stays up to date with all changes.
Is there a way to replace the existing incidents with new incidents when there are any status changes through dataflows, or would there be a better solution or approach for managing this incremental and changed data load effectively?
Best Answers
-
The 'Cache Reload Type' has a 'Load New and Updated Data' option, so technically both new and updated data should be handled by the reload? Are you saying you do not have any identifier columns to track updated data?
2 -
OAS 7.6 does not have the Incremental Load feature (Incremental Load was introduced in the OAC July 2024 update). It should be available in the March 2025 update of OAS.
7 -
With all due respect to @GayathriAnand-Oracle's reply, even if that feature were available, it would be pointless for your request.
The feature she mentions is only valid for caching a dataset and refreshing only new or changed records instead of having to reload the whole dataset into the cache. That isn't your case.
Look at the video showing that feature and you will easily see how it wouldn't help you at all!
https://www.youtube.com/watch?v=usEYPdVDR84&list=PL6gBNP-Fr8KXAOF9RgJIU5ykJvD8fHoxj
Now, the answer to your request is you doing the job of incremental loading manually.
You have 2 datasets:
- the full dataset of your incident extracts in a table, let's call this A
- a dataset based on an API call that contains new records and updates, let's call this B
How do you merge the two? Well, that's where you do the job manually.
You need a dataflow that takes dataset B, identifies the new records and adds them to your dataset A. Then the same dataflow, or a different one, takes dataset B, identifies the not-new records (the updates to existing records) and replaces those rows in dataset A.
I don't like to perform this kind of work in a dataflow, because it gives you a complex dataflow to maintain with limited control over how the job is performed (what happens if the next release of OAS changes a piece of the logic of how the dataflow is executed? You will discover it only after your dataflow has failed or corrupted your data).
My way to solve this would be a simple dataflow taking the dataset B from the API and writing it back to the same database where the table of your dataset A is stored. Then, in the database, you can build a view doing the job of replacing the updated records and adding the new ones. It's full SQL, full control over what happens, and nothing can change the behavior of that code, not even an update of your database or of OAS. Finally, you have another simple dataflow that takes the view you created as its source and inserts it into the table representing dataset A (to replace it fully with the new state, with the new and updated records all in the same place).
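A minimal sketch of what such a view could look like, purely as an illustration (the names INCIDENTS_HISTORY for the dataset A table, INCIDENTS_DELTA for the dataset B table and INCIDENT_ID as the identifier column are my assumptions, not something defined in this thread):

-- Keep the historical rows the delta doesn't touch, then add everything coming
-- from the delta (new incidents and the updated version of existing ones).
-- Table and column names are hypothetical; adjust to your schema.
CREATE OR REPLACE VIEW INCIDENTS_MERGED AS
SELECT h.*
FROM   INCIDENTS_HISTORY h
WHERE  NOT EXISTS (SELECT 1
                   FROM   INCIDENTS_DELTA d
                   WHERE  d.INCIDENT_ID = h.INCIDENT_ID)
UNION ALL
SELECT d.*
FROM   INCIDENTS_DELTA d;

The NOT EXISTS makes the delta "win" for any incident present in both tables, which is exactly the replace-updated-records behavior described above.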
I may be wrong, but the answers currently marked as correct for your request actually do not answer it at all: caching isn't going to do anything for your need, be it full or incremental, pointless…
2
Answers
-
@GayathriAnand-Oracle No, we do have an identifier column in the dataset that we can use to track the new and updated data.
Is this cache reload type option available in Oracle DV 7.6?
0 -
the volume of data is too large to be ingested through a single API call.
Not directly answering your question, but if the size is a challenge on the API call side, is incremental load really going to help you? If you don't have a different API call that only gives you new and updated rows, your size issue will still be there, because the incremental load on the OAS side will first need to receive all the data to be able to perform the INSERT-UPDATE load.
If you don't have a PK that can be used by the system to perform an incremental load, then you have to do it manually. If you have a database available it's easier: you save the dataset back to a database table, you create a view that performs the incremental load, and you then run a dataflow that replaces the dataset with the view (because a dataflow creates a new temporary table, inserts the data, deletes the existing table and then renames the temporary one to the real name, you can perform incremental loads easily with a simple dataflow).
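To make that parenthesis concrete, this is roughly the sequence described, expressed in SQL (not literal OAS internals, and reusing the hypothetical names from the earlier sketch, with INCIDENTS_MERGED as the merge view):

-- Rough equivalent of the dataflow's replace behavior, names are assumptions.
CREATE TABLE INCIDENTS_HISTORY_TMP AS SELECT * FROM INCIDENTS_MERGED;  -- build the new state
DROP TABLE INCIDENTS_HISTORY;                                          -- drop the old full table
ALTER TABLE INCIDENTS_HISTORY_TMP RENAME TO INCIDENTS_HISTORY;         -- swap in the new one

Because the new table is fully built before the old one is dropped, a single simple dataflow sourced from the view is enough to refresh the full dataset with the merged state.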
If you don't have a database, you can do the same job in a dataflow, but it will be ugly: a dataflow is a lightweight ETL, not a full ETL…
0 -
@Gianni Ceresa We do have a different API call that gives us new and updated rows. The only thing I wasn't sure of was whether the incremental run through DV dataflows would handle both new data and updated data.
0 -
Ok, in this case (with an API giving you only new and updated rows), caching isn't going to help you much in my opinion: you still don't have a full, unified dataset at this point to benefit from a "smart" caching mechanism.
I believe that a dataflow is what you will need to do the job of adding the new records and replacing the existing ones in the "historical" dataset with the data you freshly loaded from your API call.
Last time I had something like this, I handled it with a database view doing the incremental load part, and a dataflow replacing the existing dataset with the result of the view (which performed the incremental logic).
My decision to handle it myself, instead of using any incremental logic "magic" available in dataflows, is mostly to know exactly what happens and how: I don't trust a black box telling me it will do some kind of incremental load but with a very wrong logic for which identifier is used and how (datasets and dataflows are very wrong in how they have been defined: you should set an identifier column at the time you need to use it, not when you define a dataset, but well … that's a different topic).
1 -
We do have a full, unified dataset in a table. Now what we would like to do is just run incremental loads through the API to update the data on an ongoing basis and not do a full load every time.
Would you mind explaining the statement below?
'I believe that a dataflow is what you will need to do the job of adding the new records and replacing the existing ones in the "historical" dataset with the data you freshly loaded from your API call.'
We are on OAS 7.6 and, as @GayathriAnand-Oracle confirmed above, OAS 7.6 does not have the Incremental Load feature (Incremental Load was introduced in the OAC July 2024 update).
0 -
Thanks for such a detailed explanation! I will try the solution you provided and will let you know.
0