Oracle Analytics Cloud and Server


What are the limitations of OAC Data Sync for incremental data loads?

Received Response · 171 Views · 6 Comments

Content

Hi,

Please share the limitations of OAC Data Sync (version 2.6) to consider for incremental data loads.

 

Thank you,

Sowmiya R

Version

2.6

Answers

  • FPonte (Rank 6 - Analytics Lead)

    Hi.

    I'm not sure there is a single place where you can find all of that, but this can be a start.

    https://docs.oracle.com/en/cloud/paas/bi-cloud/bilpd/troubleshooting-data-loading-issues.html

    You will see the limitations mentioned in various places. For example:

    I get error code 500 [BICS-DATALOAD] when loading data sets in Data Sync

    If you see this error when loading data sets, check that none of your data set files exceeds the maximum data set storage limit of 50MB.

    And also this:

    Top FAQ for Data Loading

    https://docs.oracle.com/en/cloud/paas/bi-cloud/bilpd/top-faqs-data-loading.html

    For example:

    Where do I control the amount of memory allocated to Data Sync?
    The default setting for Data Sync is 2GB. Data Sync initiates a separate Java process every time it runs a job. The memory requirement for the main Data Sync process is controlled by the -Xmx parameter in datasync.bat/.sh. For individual jobs, memory values are read from conf-shared/jvm_parameters.txt.
    If you’re loading large amounts of data, then you can allocate more memory. You can also reduce the memory allocation if the hardware has limited memory for the Data Sync process.
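
    If you want to double-check how much heap a given -Xmx value actually gives the JVM, a throwaway Java snippet like the one below can be launched with the same flag. This is a generic JVM check I'm sketching here, not part of Data Sync itself, and the 4g value is only an example.

        // HeapCheck.java - generic JVM check, not a Data Sync utility.
        // Example run: java -Xmx4g HeapCheck
        public class HeapCheck {
            public static void main(String[] args) {
                long maxBytes = Runtime.getRuntime().maxMemory(); // roughly the ceiling set by -Xmx
                System.out.println("Max heap for this JVM: " + (maxBytes / (1024 * 1024)) + " MB");
            }
        }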

    Hope this helps or gives you some direction.

    Cheers

    Fernando

     

  • Sowmiya Ranganathan (Rank 3 - Community Apprentice)

    Hi Fernando,

    In our scenario, when Data Sync picks up the timestamp column from the on-premises database, the column's data type is read as DATE instead of TIMESTAMP. Because of this, it is affecting the incremental load. Any suggestions on this?

     

    Thanks,

    Sowmiya

  • FPonte (Rank 6 - Analytics Lead)

    Hi.

    What kind of issue?

    What do the logs tell you?

    I'm assuming you are doing something similar to what is presented in this video.

    Loading Data from Relational Sources Using Data Sync

    Best Regards

    Fernando

     

  • Sowmiya Ranganathan (Rank 3 - Community Apprentice)

    Hi Fernando,

    I'm following the same steps as mentioned above. In this case, the last_update column used for the incremental load is of type DATE. When I schedule the job to run every hour, the first run works fine, but the subsequent runs don't pick anything up because the last_update column only captures the date without a time component. Are there any workarounds for this?
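
    Just to illustrate how I understand the symptom (a rough Java sketch on my side, not anything taken from Data Sync): if last_update only carries the date, a row changed later in the same day never passes the greater-than filter against the previous run time.

        // IncrementalFilterSketch.java - illustrative only, not Data Sync code.
        // Shows why a date-truncated last_update value defeats an hourly incremental filter.
        import java.time.LocalDateTime;

        public class IncrementalFilterSketch {
            public static void main(String[] args) {
                // Watermark: start time of the previous hourly run.
                LocalDateTime lastRunStart = LocalDateTime.of(2020, 1, 15, 10, 0);

                // A row actually changed at 10:30, but last_update only keeps the date,
                // so the stored value collapses to midnight of the same day.
                LocalDateTime actualChange = LocalDateTime.of(2020, 1, 15, 10, 30);
                LocalDateTime storedAsDate = actualChange.toLocalDate().atStartOfDay();

                // Incremental filter: last_update > watermark
                System.out.println("Picked up with full timestamp: " + actualChange.isAfter(lastRunStart)); // true
                System.out.println("Picked up with date-only value: " + storedAsDate.isAfter(lastRunStart)); // false -> missed
            }
        }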

     

    Thanks,

    Sowmiya

  • User743400-Oracle (Rank 3 - Community Apprentice)

    A DATE column does contain the time part, up to seconds. DATE does not store milliseconds or time zone information.

    Having said that, incremental data loads using Data Sync should not have any limits when it comes to the number of rows. Data Sync stores the last refresh timestamp in a date field, so it's accurate to the second of the last job start, and when querying for incremental data it uses that exact value. You can check this by going to the Connections > Last Refresh Dates tab. It lists all the objects that were touched by a job and when the last extract happened.

    The Job > Tasks > Task Detail tab should show how many rows got processed.

    For incremental loads to work correctly, ensure your load strategy is set to Update table, and verify that you have specified the filter columns correctly. A unique index on the user key columns will also assist with the upsert activity.
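
    As a rough sketch of the idea in plain Java (invented names, not how Data Sync is implemented internally): the incremental pass boils down to filtering on the stored last refresh value and then upserting by the user key, which is exactly what the unique index helps with.

        // UpsertSketch.java - conceptual illustration only (requires Java 16+ for records).
        // Extract rows changed since the watermark, then insert-or-update them by key.
        import java.time.LocalDateTime;
        import java.util.LinkedHashMap;
        import java.util.List;
        import java.util.Map;

        public class UpsertSketch {
            record Row(String key, String value, LocalDateTime lastUpdate) {}

            public static void main(String[] args) {
                LocalDateTime lastRefresh = LocalDateTime.of(2020, 1, 15, 10, 0);

                List<Row> source = List.of(
                    new Row("A", "old", LocalDateTime.of(2020, 1, 15, 9, 0)),   // unchanged, filtered out
                    new Row("B", "new", LocalDateTime.of(2020, 1, 15, 10, 30)), // changed since watermark
                    new Row("C", "new", LocalDateTime.of(2020, 1, 15, 11, 0))); // brand new row

                // Target table keyed by the user key (what a unique index enforces).
                Map<String, Row> target = new LinkedHashMap<>();
                target.put("A", new Row("A", "old", LocalDateTime.of(2020, 1, 14, 8, 0)));
                target.put("B", new Row("B", "old", LocalDateTime.of(2020, 1, 14, 8, 0)));

                // Incremental pass: watermark filter, then upsert by key.
                for (Row r : source) {
                    if (r.lastUpdate().isAfter(lastRefresh)) {
                        target.put(r.key(), r); // existing key updates in place, new key inserts
                    }
                }
                target.values().forEach(System.out::println);
            }
        }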

    Let me know if you have any further questions about Data Sync that I can assist with!

  • FPonte (Rank 6 - Analytics Lead)

    Hi.

    See Vijay's reply below.

    Cheers

    Fernando