2 Replies · Latest reply: Jul 19, 2014 3:50 PM by dvohra21

    Load JSON from HDFS into ORACLE via OSCH




      I'm trying to find a way to load JSON data stored in HDFS into an Oracle database via Oracle SQL Connector for HDFS (OSCH).

      The most promising approach seemed to me to be setting up an external Hive table and creating an Oracle external table on top of it with the OSCH ExternalTable tool:


      add jar hive-json-serde-0.2.jar;

      create external table owners_test2 (
        col1 string, col2 double, col3 string
      )
      row format serde 'org.apache.hadoop.hive.contrib.serde2.JsonSerde'
      LOCATION 'hdfs://hadoopHDFSCluster/my_json_file_folder';

      hadoop jar /<path_to_orahdfs>/orahdfs.jar oracle.hadoop.exttab.ExternalTable -conf /<path_to_my_conf>/table_conf.xml -createTable
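
      For reference, my table_conf.xml follows the usual OSCH property layout. A sketch with placeholder table and connection values (the actual names in my setup differ):

```xml
<?xml version="1.0"?>
<configuration>
  <!-- Name of the Oracle external table OSCH should create -->
  <property>
    <name>oracle.hadoop.exttab.tableName</name>
    <value>OWNERS_TEST2_EXT</value>
  </property>
  <!-- Source is a Hive table rather than raw text or Data Pump files -->
  <property>
    <name>oracle.hadoop.exttab.sourceType</name>
    <value>hive</value>
  </property>
  <property>
    <name>oracle.hadoop.exttab.hive.databaseName</name>
    <value>default</value>
  </property>
  <property>
    <name>oracle.hadoop.exttab.hive.tableName</name>
    <value>owners_test2</value>
  </property>
  <!-- Oracle directory object for the external table's location files -->
  <property>
    <name>oracle.hadoop.exttab.defaultDirectory</name>
    <value>OSCH_DIR</value>
  </property>
  <property>
    <name>oracle.hadoop.exttab.connection.url</name>
    <value>jdbc:oracle:thin:@//dbhost:1521/orcl</value>
  </property>
  <property>
    <name>oracle.hadoop.exttab.connection.user</name>
    <value>SCOTT</value>
  </property>
</configuration>
```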

      But I'm getting the following error:

      2014-05-12 16:41:25,927 INFO [main] metastore.HiveMetaStore (HiveMetaStore.java:newRawStore(404)) - 0: Opening raw store with implemenation class:org.apache.hadoop.hive.metastore.ObjectStore

      2014-05-12 16:41:25,955 INFO [main] metastore.ObjectStore (ObjectStore.java:initialize(222)) - ObjectStore, initialize called

      2014-05-12 16:41:26,178 INFO [main] DataNucleus.Persistence (Log4JLogger.java:info(77)) - Property datanucleus.cache.level2 unknown - will be ignored

      2014-05-12 16:41:26,697 INFO [main] metastore.ObjectStore (ObjectStore.java:getPMF(267)) - Setting MetaStore object pin classes with hive.metastore.cache.pinobjtypes="Table,StorageDescriptor,SerDeInfo,Partition,Database,Type,FieldSchema,Order"

      2014-05-12 16:41:26,732 INFO [main] metastore.ObjectStore (ObjectStore.java:setConf(205)) - Initialized ObjectStore

      2014-05-12 16:41:27,745 INFO [main] metastore.HiveMetaStore (HiveMetaStore.java:logInfo(447)) - 0: get_table : db=default tbl=owners_test2

      2014-05-12 16:41:27,834 INFO [main] HiveMetaStore.audit (HiveMetaStore.java:logAuditEvent(238)) - ugi=oracle ip=unknown-ip-addr cmd=get_table : db=default tbl=owners_test2

      2014-05-12 16:41:27,839 INFO [main] DataNucleus.Datastore (Log4JLogger.java:info(77)) - The class "org.apache.hadoop.hive.metastore.model.MFieldSchema" is tagged as "embedded-only" so does not have its own datastore table.

      2014-05-12 16:41:27,840 INFO [main] DataNucleus.Datastore (Log4JLogger.java:info(77)) - The class "org.apache.hadoop.hive.metastore.model.MOrder" is tagged as "embedded-only" so does not have its own datastore table.

      oracle.hadoop.exttab.ExternalTableException: Unsupported Hive table serialization library org.apache.hadoop.hive.contrib.serde2.JsonSerde

        at oracle.hadoop.exttab.hive.HiveSource.initialize(HiveSource.java:262)

        at oracle.hadoop.exttab.hive.HiveSource.getDataSet(HiveSource.java:444)

        at oracle.hadoop.exttab.ExternalTable.doCreateTable(ExternalTable.java:1367)

        at oracle.hadoop.exttab.ExternalTable.run(ExternalTable.java:1673)

        at oracle.hadoop.exttab.ExternalTable.run(ExternalTable.java:1745)

        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)

        at oracle.hadoop.exttab.ExternalTable.main(ExternalTable.java:1851)

        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)

        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)

        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)

        at java.lang.reflect.Method.invoke(Method.java:606)

        at org.apache.hadoop.util.RunJar.main(RunJar.java:208)


      Any suggestions, somebody? I have literally found nothing on Google!


      What are your preferred ways to load JSON data from HDFS into Oracle tables?



        • 1. Re: Load JSON from HDFS into ORACLE via OSCH

          Oracle SQL Connector for HDFS supports Hive tables only over delimited text files, as documented here. For JSON files such as in your example, you can create a new Hive table that uses the JSON SerDe to produce a delimited text representation of the data, and then use Oracle SQL Connector for HDFS on that table.
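
          A sketch of that staging step, assuming the JSON-backed owners_test2 table from your post (the staging table name and delimiter are illustrative):

```sql
-- Staging table stored as delimited text, which OSCH can read via Hive.
CREATE TABLE owners_test2_text (
  col1 STRING, col2 DOUBLE, col3 STRING
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
STORED AS TEXTFILE;

-- Materialize the JSON data as delimited text files in HDFS.
INSERT OVERWRITE TABLE owners_test2_text
SELECT col1, col2, col3 FROM owners_test2;
```

          After this, point OSCH at owners_test2_text instead of the JSON-backed table.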


          Alternatively, you can use Oracle Loader for Hadoop, which can read Hive tables over any format Hive supports, to create Oracle Data Pump files from the JSON files. The Oracle Data Pump files can then be queried by Oracle SQL Connector for HDFS.
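
          A sketch of that Oracle Loader for Hadoop invocation (the paths and the configuration file name are illustrative; OraLoader is the OLH driver class):

```shell
# Run Oracle Loader for Hadoop against the Hive table. The configuration file
# (hypothetical name olh_conf.xml) would set mapreduce.job.outputformat.class to
# oracle.hadoop.loader.lib.output.DataPumpOutputFormat, so that Data Pump files
# are written to HDFS, where OSCH can then query them.
hadoop jar ${OLH_HOME}/jlib/oraloader.jar oracle.hadoop.loader.OraLoader \
    -conf olh_conf.xml
```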

          • 2. Re: Load JSON from HDFS into ORACLE via OSCH

            Formats other than text files are not supported by OSCH.