
Load JSON from HDFS into ORACLE via OSCH

967981 Member Posts: 2
edited Jul 19, 2014 4:50PM in Big Data Connectors/Hadoop

Hi,

I'm trying to find a way to load JSON data stored in HDFS into Oracle Database via Oracle SQL Connector for HDFS (OSCH).

The most promising way seemed to me to set up an external Hive table and create an Oracle external table on top of it with the Oracle HDFS external table tool:

add jar hive-json-serde-0.2.jar;

create external table owners_test2 (
  col1 string, col2 double, col3 string
)
ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.JsonSerde'
LOCATION 'hdfs://hadoopHDFSCluster/my_json_file_folder';


hadoop jar /<path_to_orahdfs>/orahdfs.jar oracle.hadoop.exttab.ExternalTable -conf /<path_to_my_conf>/table_conf.xml -createTable
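
For reference, table_conf.xml follows this general shape (a minimal sketch using the documented OSCH properties; the table names, directory object, and connection values below are placeholders, not my real settings):

<?xml version="1.0"?>
<configuration>
  <!-- Name of the external table to create in Oracle (placeholder) -->
  <property>
    <name>oracle.hadoop.exttab.tableName</name>
    <value>OWNERS_TEST2_EXT</value>
  </property>
  <!-- Take the source definition from a Hive table -->
  <property>
    <name>oracle.hadoop.exttab.sourceType</name>
    <value>hive</value>
  </property>
  <property>
    <name>oracle.hadoop.exttab.hive.databaseName</name>
    <value>default</value>
  </property>
  <property>
    <name>oracle.hadoop.exttab.hive.tableName</name>
    <value>owners_test2</value>
  </property>
  <!-- Oracle directory object and JDBC connection (placeholders) -->
  <property>
    <name>oracle.hadoop.exttab.defaultDirectory</name>
    <value>OSCH_EXTTAB_DIR</value>
  </property>
  <property>
    <name>oracle.hadoop.exttab.connection.url</name>
    <value>jdbc:oracle:thin:@//dbhost:1521/orcl</value>
  </property>
  <property>
    <name>oracle.hadoop.exttab.connection.user</name>
    <value>SCOTT</value>
  </property>
</configuration>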


But I'm getting this error:


2014-05-12 16:41:25,927 INFO [main] metastore.HiveMetaStore (HiveMetaStore.java:newRawStore(404)) - 0: Opening raw store with implemenation class:org.apache.hadoop.hive.metastore.ObjectStore
2014-05-12 16:41:25,955 INFO [main] metastore.ObjectStore (ObjectStore.java:initialize(222)) - ObjectStore, initialize called
2014-05-12 16:41:26,178 INFO [main] DataNucleus.Persistence (Log4JLogger.java:info(77)) - Property datanucleus.cache.level2 unknown - will be ignored
2014-05-12 16:41:26,697 INFO [main] metastore.ObjectStore (ObjectStore.java:getPMF(267)) - Setting MetaStore object pin classes with hive.metastore.cache.pinobjtypes="Table,StorageDescriptor,SerDeInfo,Partition,Database,Type,FieldSchema,Order"
2014-05-12 16:41:26,732 INFO [main] metastore.ObjectStore (ObjectStore.java:setConf(205)) - Initialized ObjectStore
2014-05-12 16:41:27,745 INFO [main] metastore.HiveMetaStore (HiveMetaStore.java:logInfo(447)) - 0: get_table : db=default tbl=owners_test2
2014-05-12 16:41:27,834 INFO [main] HiveMetaStore.audit (HiveMetaStore.java:logAuditEvent(238)) - ugi=oracle ip=unknown-ip-addr cmd=get_table : db=default tbl=owners_test2
2014-05-12 16:41:27,839 INFO [main] DataNucleus.Datastore (Log4JLogger.java:info(77)) - The class "org.apache.hadoop.hive.metastore.model.MFieldSchema" is tagged as "embedded-only" so does not have its own datastore table.
2014-05-12 16:41:27,840 INFO [main] DataNucleus.Datastore (Log4JLogger.java:info(77)) - The class "org.apache.hadoop.hive.metastore.model.MOrder" is tagged as "embedded-only" so does not have its own datastore table.
oracle.hadoop.exttab.ExternalTableException: Unsupported Hive table serialization library org.apache.hadoop.hive.contrib.serde2.JsonSerde
  at oracle.hadoop.exttab.hive.HiveSource.initialize(HiveSource.java:262)
  at oracle.hadoop.exttab.hive.HiveSource.getDataSet(HiveSource.java:444)
  at oracle.hadoop.exttab.ExternalTable.doCreateTable(ExternalTable.java:1367)
  at oracle.hadoop.exttab.ExternalTable.run(ExternalTable.java:1673)
  at oracle.hadoop.exttab.ExternalTable.run(ExternalTable.java:1745)
  at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
  at oracle.hadoop.exttab.ExternalTable.main(ExternalTable.java:1851)
  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
  at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
  at java.lang.reflect.Method.invoke(Method.java:606)
  at org.apache.hadoop.util.RunJar.main(RunJar.java:208)

Any suggestions? I literally found nothing on Google!

What are your preferred ways to load JSON data from HDFS into Oracle tables?

Thanks!

Answers

  • Mannamal-Oracle Member Posts: 260 Employee

    Oracle SQL Connector for HDFS supports Hive tables only when they are built over delimited text files, as documented here. For JSON files such as those in your example, you can use the JSON SerDe table to create a new Hive table holding a delimited text representation of the data, and then point Oracle SQL Connector for HDFS at that table, as in the sketch below.
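
    For example, a minimal sketch of that intermediate step, assuming the owners_test2 JSON SerDe table from your post (the new table name and the comma delimiter are just illustrative):

    -- Materialize the JSON data as a delimited text Hive table
    CREATE TABLE owners_test2_text
    ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
    STORED AS TEXTFILE
    AS SELECT col1, col2, col3 FROM owners_test2;

    -- Then rerun the ExternalTable tool with
    -- oracle.hadoop.exttab.hive.tableName=owners_test2_text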

    Alternatively, you can use Oracle Loader for Hadoop (OLH), which can read Hive tables over any format Hive supports, and create Oracle Data Pump files from the JSON data. Those Data Pump files can then be queried by Oracle SQL Connector for HDFS.
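
    A minimal sketch of what that run can look like, assuming the OraLoader tool with the Hive input and Data Pump output formats from the OLH documentation (check the property names against your release; connection, target table, and output directory settings are omitted here):

    <!-- olh_conf.xml (sketch) -->
    <configuration>
      <!-- Read source rows from the Hive table; any SerDe works here -->
      <property>
        <name>mapreduce.inputformat.class</name>
        <value>oracle.hadoop.loader.lib.input.HiveToAvroInputFormat</value>
      </property>
      <property>
        <name>oracle.hadoop.loader.input.hive.databaseName</name>
        <value>default</value>
      </property>
      <property>
        <name>oracle.hadoop.loader.input.hive.tableName</name>
        <value>owners_test2</value>
      </property>
      <!-- Write Oracle Data Pump files instead of loading online -->
      <property>
        <name>mapreduce.outputformat.class</name>
        <value>oracle.hadoop.loader.lib.output.DataPumpOutputFormat</value>
      </property>
    </configuration>

    hadoop jar $OLH_HOME/jlib/oraloader.jar oracle.hadoop.loader.OraLoader -conf olh_conf.xml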

  • dvohra21 Member Posts: 14,325 Gold Crown

    For Hive sources, formats other than delimited text files are not supported by OSCH.

This discussion has been closed.