Categories
Update / Inserts in Data-Sync

Summary
Looking for the count of records Inserted & Updated
Content
Is there a way to find out how many records were inserted and how many were updated? I need separate counts, not a total count.
Version
Oracle Analytics Cloud Data Sync (2.5)
Answers
-
Maybe this is it.
Looks like 909 inserted and 5000 updated
LOG: C:\OAC\OACDataSync_V2_5\log\BHS_OAC_DS_PSOFT-VOUCHER_DATA.25940821
Truncate Stage Table
SQL: TRUNCATE TABLE "S1#FACT_VOUCHER_PO"
Stored Procedure: false
Continue on error: false
Number of retries: 1
361 GLOBAL 2019-04-28 07:02:29.707 EST Reader1 - 5909 records processed, total time: 4 second(s) 375 millisecond(s), idle time: 0 millisecond(s)
362 GLOBAL 2019-04-28 07:02:29.707 EST Reader1 - successfully completed
368 GLOBAL 2019-04-28 07:03:12.893 EST Writer1 - 909 records processed, 0 records failed, total time: 47 second(s) 514 millisecond(s), idle time: 4 second(s) 328 millisecond(s) (9%), merge time: 41 second(s) 592 millisecond(s) (87%). Merge counts: "Merge SQL INSERT_UPDATE_DATA_COPY" processed 909 row(s), "Truncate Stage Table" reported 0 row(s)
369 GLOBAL 2019-04-28 07:03:12.893 EST Writer1 - successfully completed
370 GLOBAL 2019-04-28 07:04:12.324 EST Writer2 - 5000 records processed, 0 records failed, total time: 1 minute(s), 46 second(s), 945 millisecond(s), idle time: 2 second(s) 484 millisecond(s) (2%), merge time: 1 minute(s), 40 second(s), 582 millisecond(s) (94%). Merge counts: "Merge SQL INSERT_UPDATE_DATA_COPY" processed 5000 row(s), "Truncate Stage Table" reported 0 row(s)
371 GLOBAL 2019-04-28 07:04:12.324 EST Writer2 - successfully completed
372 GLOBAL 2019-04-28 07:04:12.324 EST 5909 records read. Read throughput, records per second: 1355.
373 GLOBAL 2019-04-28 07:04:12.324 EST 5909 stage and 5909 total records written/processed. Write throughput, records per second: 69.
374 GLOBAL 2019-04-28 07:04:12.324 EST Merge counts: "Merge SQL INSERT_UPDATE_DATA_COPY" processed 5909 row(s), "Truncate Stage Table" reported 0 row(s)
0 -
Unfortunately, there is no direct way of figuring it out. Data Sync uses MERGE statements to perform insert-or-update in a single SQL statement, and from the rows-affected counts reported in the log file one cannot tell which rows were inserted and which were updated. Data Sync uses two writers to push the data, hence the two counts in the log; they reflect the writer split, not the insert/update split.
If this information is extremely important to you, then you can do the following (using version 2.6)...
1. Create two new columns, PROCESS_ID (NUMBER(38)) and DML_CODE (VARCHAR(1)), using Data Loads->Column Mapping->Manage Target Columns->New Target Column.
2. Assign runtime defaults to these two new columns by clicking on Target Expression. For the PROCESS_ID column, choose %CURRENT_PROCESS_ID as the default; for DML_CODE, choose %DML_CODE.
The next time you run the data load, PROCESS_ID will be populated with the run's process ID, and DML_CODE will have 'I' for inserts and 'U' for updates.
Then you can identify the number of inserts/updates for any run with a simple select statement like so:
select count(*), DML_CODE from my_table where process_id = ? group by DML_CODE
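The counting query above can be demonstrated end to end with a small SQLite sketch. Everything here is invented for illustration: the table, the row values, and the process IDs are stand-ins, and SQLite's ON CONFLICT upsert plays the role of the MERGE that Data Sync issues against the real target.

```python
import sqlite3

# Stand-in for the target table, with the two extra columns suggested above.
con = sqlite3.connect(":memory:")
con.execute("""
    CREATE TABLE my_table (
        id INTEGER PRIMARY KEY,
        amount NUMERIC,
        process_id INTEGER,
        dml_code TEXT
    )
""")

# A prior load (process 41) inserted rows 1-3.
con.executemany("INSERT INTO my_table VALUES (?, ?, ?, ?)",
                [(1, 10, 41, 'I'), (2, 20, 41, 'I'), (3, 30, 41, 'I')])

# A new run (process 42) merges rows 2-5: rows 2-3 already exist and get
# updated (dml_code -> 'U'), rows 4-5 are new inserts (dml_code stays 'I').
# In Data Sync the 'I'/'U' values come from the %DML_CODE runtime default.
con.executemany(
    """INSERT INTO my_table (id, amount, process_id, dml_code)
       VALUES (?, ?, 42, 'I')
       ON CONFLICT(id) DO UPDATE SET
           amount = excluded.amount,
           process_id = excluded.process_id,
           dml_code = 'U'""",
    [(2, 25), (3, 35), (4, 40), (5, 50)])

# The query from the answer: separate insert/update counts for one run.
counts = dict(
    (code, n) for n, code in con.execute(
        "SELECT COUNT(*), dml_code FROM my_table "
        "WHERE process_id = ? GROUP BY dml_code", (42,)))
print(counts)  # {'I': 2, 'U': 2}
```

Note that the merge reports four rows affected in total; only the DML_CODE column makes the 2-inserts/2-updates split recoverable afterwards.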
Hope this helps!
0 -
Thanks for the help. Is it possible to access the refresh dates outside of Data Sync?
0 -
No, it's not possible. The metadata and the runtime data are stored in a file-based relational database called JavaDB, and this database cannot be queried from outside.
However, if you want to use a refresh date anywhere in the data load routines, you can create a parameter (under the Parameters tab). Choose Timestamp as the data type, choose Runtime variable, then choose SOURCE_REFRESH_TIMESTAMP or TARGET_REFRESH_TIMESTAMP and pick a format. You can refer to the parameter anywhere in the data load routines by using the %% prefix. For example, if you created MY_PARAMETER, you can refer to it as %%MY_PARAMETER.
0 -
We use data (PSOFT) that is always one day old. Our Prod data is moved into our FINT DB at 12:01 AM every night, and at 7:00 AM that morning Data Sync runs. Our counts are always off because the refresh date and the entered/updated dates in Data Sync differ by 12 hours or more. We need a way to compensate for this difference. Right now I must go in manually every day and change the refresh date from 7:00 AM to 12:01 AM to get all records.
0 -
We want to alter the SOURCE_REFRESH_TIMESTAMP and TARGET_REFRESH_TIMESTAMP from its set time (7:00 AM) to 12:01 AM for each FACT table.
0 -
That's easy; to compensate, you have two options (in general).
1. If your source DB's time zone is different from that of the machine where Data Sync runs, be sure to specify time zone information for the connection; this way Data Sync will automatically compensate for the difference between the time zones.
2. If you want to manually specify how much data to re-extract (in your example, 12 hours), go to Jobs->Connectivity Parameters->Source Connection->Prune Time and set it to 12 hours. Let's say you last ran the job at 12:01 on April 28; without a prune time, the current run would ask for records changed since 12:01 on April 28. With a prune time of 12 hours, it will instead ask for all data changed since 00:01 on April 28 (offset by 12 hours).
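The prune-time arithmetic in option 2 is just a subtraction from the last refresh timestamp. A minimal sketch, using the example timestamps above:

```python
from datetime import datetime, timedelta

# Last successful run, as recorded in Data Sync's refresh dates.
last_refresh = datetime(2019, 4, 28, 12, 1)   # 12:01 on April 28

# A prune time of 12 hours shifts the incremental-extract cutoff back,
# so the next run re-reads anything changed inside that window.
prune = timedelta(hours=12)
cutoff = last_refresh - prune

print(cutoff)  # 2019-04-28 00:01:00
```

The cost of a generous prune time is that the overlapping window is re-extracted on every run, which the MERGE then absorbs as updates.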
I am guessing the second option is what you want.
0 -
You cannot update it from outside. Manually, yes: you can go to Connections -> Refresh Dates and update the value. But this will not be sustainable, as you would be doing it prior to every run. It is best to use the prune-time concept at the job level; it applies to all the tables being populated in a project.
0 -
Thank You So Much!!!! This is a great help. BTW is this information documented anywhere?
0