Data Lake & Services

StreamSets Oracle CDC to HDFS/Hive Metastore data stream

4160383, Jan 28 2020 (edited Jan 28 2020)

Hi,

I have created a data stream using the StreamSets tool available on the Oracle Big Data Appliance. I want to load data from Oracle 12 into Hadoop/Hive. I have used Oracle CDC, Hive Metastore, and other Hadoop/HDFS components as instructed in the StreamSets documentation. Every new row inserted in Oracle flows successfully to a data file in HDFS, but the data is not available to query until the pipeline is stopped. I think the root cause is that StreamSets writes the data to a temp file named "_tmp_*" in HDFS, whereas I have specified the file name "SDF_*". StreamSets renames the file from the temp name to the actual name only when the pipeline is stopped. This is not streaming; it is a batch data load.

Can you please suggest how I can fix this? I need the streamed data to be immediately queryable from the Hive table.

Thanks

Ashish
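
The behavior described above can be illustrated with a small, self-contained sketch. It is not StreamSets code: `RollingWriter`, `visible_files`, and the `max_records` limit are hypothetical names made up for this illustration. It models a writer that stages records in a "_tmp_" file and renames it to the "SDF_" name only when the file is closed, and a reader that skips names starting with "_" or "." (the way Hadoop input formats treat hidden files). Whether and how the StreamSets destination exposes a comparable roll limit (for example a maximum record count, file size, or idle timeout) is an assumption to check against its documentation.

```python
import os
import tempfile


class RollingWriter:
    """Toy model (hypothetical) of a writer that stages data in a hidden temp file."""

    def __init__(self, directory, prefix="SDF_", max_records=3):
        self.directory = directory
        self.prefix = prefix
        self.max_records = max_records  # assumed roll limit, for illustration only
        self.file_index = 0
        self.count = 0
        self.handle = None

    def _tmp_path(self):
        return os.path.join(self.directory, f"_tmp_{self.prefix}{self.file_index}")

    def _final_path(self):
        return os.path.join(self.directory, f"{self.prefix}{self.file_index}")

    def write(self, record):
        # Open a new temp file lazily, on the first record after a roll.
        if self.handle is None:
            self.handle = open(self._tmp_path(), "w")
        self.handle.write(record + "\n")
        self.handle.flush()
        self.count += 1
        # Roll the file once the record limit is reached.
        if self.count >= self.max_records:
            self.close()

    def close(self):
        # Closing renames the temp file to its final, reader-visible name.
        if self.handle is None:
            return
        self.handle.close()
        os.rename(self._tmp_path(), self._final_path())
        self.handle = None
        self.count = 0
        self.file_index += 1


def visible_files(directory):
    """Files a reader sees if it skips names starting with '_' or '.'."""
    return sorted(
        name for name in os.listdir(directory)
        if not name.startswith(("_", "."))
    )


if __name__ == "__main__":
    workdir = tempfile.mkdtemp()
    writer = RollingWriter(workdir, max_records=3)
    for row in ["r1", "r2"]:
        writer.write(row)
    print(visible_files(workdir))  # nothing visible yet: data sits in _tmp_SDF_0
    writer.write("r3")             # limit reached, file is rolled and renamed
    print(visible_files(workdir))
    writer.close()
```

In this model, data becomes queryable only at the rename, so a file that is never rolled stays hidden until shutdown. A smaller roll limit shortens that delay at the cost of producing more, smaller files.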

#big-data-connectors-hadoop, #hadoop