Oracle Analytics Cloud and Server


Self Replacing Data in Data Flow

Accepted answer · 40 Views · 3 Comments

Hi,

I am looking to implement a manual tracker for one of our ongoing projects.

There is time-series information that I would manually upload to the system on a regular basis.

Basically, I've created a data flow that looks like the one below.

The idea is that I run the data flow on a regular basis to update the table. However, I've found it won't execute if I try to save the output as "Dataset A" while also using "Dataset A" as one of the sources.

Unfortunately, I won't be able to set up a proper database for this tracker to use as a connection.

Seeking advice.

Thank you,

Regards

Best Answer

  • GabriellePrichard-Oracle
    GabriellePrichard-Oracle Rank 4 - Community Specialist
    Answer ✓

    Hello! Data flows will not allow you to read from and overwrite the same "Dataset A" within one data flow. Here's a pattern you can use instead: two data flows placed in a single sequence that can be run on a schedule.

    Data Flow 1

    1. Read Dataset A
    2. Write it out immediately as Dataset A_backup (in case you need to keep a copy)
    3. Union your “new portion of A” uploads with Dataset A
    4. Write the combined result to Dataset A_temp

    Data Flow 2

    1. Read Dataset A_temp
    2. Overwrite Dataset A with this refreshed dataset

    Place these two data flows in one sequence.
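    The two-flow pattern above can be sketched in Python, with CSV files and pandas standing in for OAC datasets. This is purely an illustration of the temp-dataset idea (back up, union into a temp output, then overwrite the original in a second step), not actual OAC data flow syntax; all file names here are made up:

    ```python
    import pandas as pd
    from pathlib import Path

    def flow_1(dataset_a: Path, new_rows: Path, backup: Path, temp: Path) -> None:
        """Data Flow 1: back up Dataset A, union in new uploads, write to a temp dataset."""
        a = pd.read_csv(dataset_a)                         # 1. read Dataset A
        a.to_csv(backup, index=False)                      # 2. write Dataset A_backup
        new = pd.read_csv(new_rows)
        combined = pd.concat([a, new], ignore_index=True)  # 3. union new rows with A
        combined.to_csv(temp, index=False)                 # 4. write Dataset A_temp

    def flow_2(temp: Path, dataset_a: Path) -> None:
        """Data Flow 2: overwrite Dataset A from the temp dataset."""
        pd.read_csv(temp).to_csv(dataset_a, index=False)
    ```

    Because flow_1 only ever writes to the backup and temp outputs, and flow_2 only reads the temp output, neither step reads from and overwrites the same dataset, which is the constraint the single-flow approach runs into.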

    Let me know if this works for you.

Answers

  • User_GBD0K
    User_GBD0K Rank 1 - Community Starter

    Hi Prichard,

    Thank you for the input. While the individual data flows did work, when I placed them into a sequence it would not execute properly.

    It seems to fail on the second data flow step.

    Below is the failed log for the sequence "Work A":


    ************************************************************
    * 2025-05-21 11:52:33 'XXXX'.'Work A'
    ************************************************************
    74 INFO 2025-05-21 11:52:33.375 SGT Starting to process request, run id: 711F5E10946E19AC71E1282BBAEB2E after being in queue for 21 second(s), 366 millisecond(s).
    230 INFO 2025-05-21 11:52:39.792 SGT Will run up to 5 task(s) in parallel
    234 INFO 2025-05-21 11:52:39.826 SGT Creating necessary connections.
    238 INFO 2025-05-21 11:52:39.835 SGT Sources created.
    239 INFO 2025-05-21 11:52:39.841 SGT 143.156 (GB) of 2.50 (TB) used (5.592 %).
    248 INFO 2025-05-21 11:52:39.868 SGT Loading tasks for a new run.
    251 FINER 2025-05-21 11:52:40.03 SGT Loading of tasks completed
    256 INFO 2025-05-21 11:52:40.128 SGT A total of 3 task(s) need to be executed for this run.
    259 INFO 2025-05-21 11:52:40.144 SGT Executing tasks.
    306 INFO 2025-05-21 11:52:45.626 SGT Task "DFLW : Job 1" completed
    343 SEVERE 2025-05-21 11:52:51.027 SGT Execution of "dataflow : 'XXXX'.'Job 2' ('XXXX'.'Work A')" on BIJDBC failed. OBIS job id 40000000-5383-1a0b-8029-ac1f4c100000
    [nQSError: 46236] Dataset Service error during Bulk Finalize - {"prefix":"DSS","code":50000,"message":"DSS understood the request, but was unable to perform the operation due to an internal error","subMessage":""}
    [nQSError: 43224] The Dataflow "Job 2" failed during the execution.
    [nQSError: 43204] Asynchronous Job Manager failed to execute the asynchronous job.
    346 SEVERE 2025-05-21 11:52:51.061 SGT An error occurred, which could not be interpreted.
    352 SEVERE 2025-05-21 11:52:51.262 SGT An error occurred, which could not be interpreted.
    358 SEVERE 2025-05-21 11:52:51.362 SGT An error occurred, which could not be interpreted.
    362 SEVERE 2025-05-21 11:52:51.369 SGT An error occurred, which could not be interpreted.
    363 SEVERE 2025-05-21 11:52:51.377 SGT Task "DFLW : Job 2 - Work A" failed. Check the tasks log for more information
    386 INFO 2025-05-21 11:52:51.563 SGT Finished.
    390 INFO 2025-05-21 11:52:51.623 SGT
    ************************************************************
    * Summary
    ************************************************************

    392 INFO 2025-05-21 11:52:51.629 SGT Job Run Summary
    Total Number Of Tasks: 3
    Total Number Of Successful Rows across all tasks: 0
    Total Data Size across all tasks: .00 (BYTE)

    394 INFO 2025-05-21 11:52:51.635 SGT Job finishing up....
    398 ALL 2025-05-21 11:52:51.75 SGT The process took 18 Second(s) 216 Milli-second(s)
    405 INFO 2025-05-21 11:52:51.769 SGT Closing connections...
    410 ALL 2025-05-21 11:52:51.781 SGT The process failed.

    Below is the failed log for the data flow "Job 2":


    **************************************************************************
    * 2025-05-21 11:52:45 DFLW : Job 2 - Work A
    **************************************************************************
    339 INFO 2025-05-21 11:52:45.974 SGT "dataflow : 'XXXX'.'Job 2' ('XXXX'.'Work A')" on BIJDBC submitted with job id 40000000-5383-1a0b-8029-ac1f4c100000
    341 INFO 2025-05-21 11:52:51.01 SGT Finished execution with status Failed
    342 SEVERE 2025-05-21 11:52:51.014 SGT Execution failed.
    [nQSError: 46236] Dataset Service error during Bulk Finalize - {"prefix":"DSS","code":50000,"message":"DSS understood the request, but was unable to perform the operation due to an internal error","subMessage":""}
    [nQSError: 43224] The Dataflow "Job 2" failed during the execution.
    [nQSError: 43204] Asynchronous Job Manager failed to execute the asynchronous job.
    365 INFO 2025-05-21 11:52:51.386 SGT Execution time :18 second(s), 377 millisecond(s)
    367 INFO 2025-05-21 11:52:51.39 SGT
    ***********************************************************************************
    * Historical load summary for dataflow : 'XXXX'.'Job 2'
    ***********************************************************************************

    Number Of Successful Loads: 1
    Number Of Failed Loads: 0
    Load Duration: 14 second(s), 0 millisecond(s)
    Load Date: 2025-05-21 10:55:15.0
    Average Load Duration: 14 second(s), 0 millisecond(s)
    Minimum Load Duration: 14 second(s), 0 millisecond(s)
    Maximum Load Duration: 14 second(s), 0 millisecond(s)

    Is there any way I can set this up as a sequence?

  • GabriellePrichard-Oracle
    GabriellePrichard-Oracle Rank 4 - Community Specialist

    You should be able to use a data flow's output as another data flow's input within a sequence. Do the task-level logs provide any more information?