StreamSets Oracle CDC to HDFS/Hive Metastore data stream

4160383 Member Posts: 1
edited Jan 28, 2020 9:28AM in Big Data Connectors/Hadoop


I have created a data stream using the StreamSets tool available in Oracle Big Data Appliance. I want to load data from Oracle 12 into Hadoop/Hive. I have used the Oracle CDC, Hive Metastore, and other Hadoop/HDFS components as instructed in the StreamSets documentation. Every new row inserted in Oracle flows successfully to a data file in HDFS, but the data is not available to query until the pipeline is stopped. I think the root cause is that StreamSets writes the data to a temp file named "_tmp_*" in HDFS, whereas I have specified the file name prefix "SDF_*". StreamSets renames the file from the temp name to the actual name only when the pipeline is stopped. This is not streaming; it is a batch data load.
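To illustrate why I suspect the file name matters, here is a minimal Python sketch of the rule that Hadoop's default input file filter applies: files whose names start with "_" or "." are treated as hidden and skipped by readers such as Hive. The function name and the sample file names are hypothetical, made up for illustration; this is not StreamSets or Hive code.

```python
def is_visible_to_hive(filename: str) -> bool:
    """Mimic Hadoop's default hidden-file filter.

    Files whose names start with '_' or '.' are skipped by
    FileInputFormat-based readers, which Hive tables rely on.
    """
    return not filename.startswith(("_", "."))


# Hypothetical file names: one still being written, one already renamed.
files = ["_tmp_SDF_0", "SDF_a1b2"]

# Only files that pass the filter would show up in query results.
queryable = [f for f in files if is_visible_to_hive(f)]
print(queryable)
```

If this is what is happening, the records in the "_tmp_*" file stay invisible to Hive until the file is closed and renamed to "SDF_*".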

Can you please suggest how I can fix this? I need the streamed data to be immediately queryable from the Hive table.