What are the limitations of OAC Data Sync for incremental data load?
Content
Hi,
Please share the limitations of OAC Data Sync (version 2.6) when considering incremental data load.
Thank you,
Sowmiya R
Version
2.6
Answers
-
Hi.
Not sure there is one place where we can get all of that, but this can be a start:
https://docs.oracle.com/en/cloud/paas/bi-cloud/bilpd/troubleshooting-data-loading-issues.html
You will see the limitations mentioned in some places. For example:
I get error code 500 [BICS-DATALOAD] when loading data sets in Data Sync
If you see this error when loading data sets, check that none of your data set files exceeds the maximum data set storage limit of 50MB.
And this:
Top FAQ for Data Loading
https://docs.oracle.com/en/cloud/paas/bi-cloud/bilpd/top-faqs-data-loading.html
For example:
Where do I control the amount of memory allocated to Data Sync?
The default setting for Data Sync is 2GB. Data Sync initiates a separate Java process every time it runs a job. The memory requirement for the main Data Sync process is controlled by the -Xmx parameter in datasync.bat/.sh. For individual jobs, memory values are read from conf-shared/jvm_parameters.txt.
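For illustration only (the exact lines and their format vary by Data Sync version, so check your own files), changing the heap means adjusting the standard JVM maximum-heap value, for example:

    -Xmx2048m   (default, 2GB)
    -Xmx4096m   (increased to 4GB for large loads)

The same kind of value would go in datasync.bat/.sh for the main process and in conf-shared/jvm_parameters.txt for the individual job processes.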
If you're loading large amounts of data, then you can allocate more memory. You can also reduce the memory allocation if the hardware has limited memory for the Data Sync process. Hope this helps or gives you some direction.
Cheers
Fernando
-
Hi Fernando,
In our scenario, while picking up the timestamp column from the on-prem database system, the data type of the column is taken as DATE instead of TIMESTAMP. Because of this, it is affecting the incremental load. Any suggestions on this?
Thanks,
Sowmiya
-
Hi.
What kind of issue are you seeing?
What do the logs tell you?
I'm assuming you are doing something similar to what is presented in this video:
Loading Data from Relational Sources Using Data Sync
Best Regards
Fernando
-
Hi Fernando,
I'm following the same steps as mentioned above. In this case, the last_update column used for the incremental load is of type DATE. When I schedule the job to run every hour, the first hour works fine, but the subsequent hours do not, because the last_update column seems to capture only the date without the time. Are there any workarounds for this?
Thanks,
Sowmiya
-
A DATE column does contain the time part, up to seconds. DATE does not store milliseconds or time zone information.
Having said that, incremental data loads using Data Sync should not have any limits when it comes to the number of rows. Data Sync stores the last refresh timestamp in a DATE field, so it is accurate to the second of the last job start, and when querying for incremental data it uses that exact value. You can check this by going to the Connections -> Last Refresh Dates tab, which lists all the objects touched by a job and when the last extract happened.
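If you want to double-check that the DATE column really carries the time of day, and see roughly what the incremental filter looks like, here is a quick sketch against a made-up source table SRC_ORDERS (replace it with your own table and column):

    -- confirm LAST_UPDATE holds hours/minutes/seconds, not just the date
    SELECT TO_CHAR(LAST_UPDATE, 'YYYY-MM-DD HH24:MI:SS') AS LAST_UPDATE_TS
    FROM SRC_ORDERS
    WHERE ROWNUM <= 5;

    -- the incremental extract is conceptually of this shape, with the stored
    -- last refresh value bound in place of :LAST_REFRESH_DATE
    SELECT *
    FROM SRC_ORDERS
    WHERE LAST_UPDATE > :LAST_REFRESH_DATE;

This is only a conceptual shape, not the exact SQL that Data Sync generates.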
The Job -> Tasks -> Task Detail tab shows how many rows were processed.
For incremental loads to work correctly, ensure your load strategy is set to Update table and verify that you have specified the filter columns correctly. A unique index on the user key columns will also help the upsert activity.
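As a minimal sketch (the table and column names here are made up; use your own target table and user key columns):

    -- unique index on the user key column(s) to speed up the update/insert (upsert) step
    CREATE UNIQUE INDEX TGT_ORDERS_UK ON TGT_ORDERS (ORDER_ID);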
Let me know if you have any further questions about Data Sync that I can assist with!
-
Hi.
See Vijay's reply above.
Cheers
Fernando