Forum Stats

  • 3,872,085 Users
  • 2,266,378 Discussions
  • 7,911,045 Comments

Discussions

extract JSON from CLOB

CJ Bell
CJ Bell Member Posts: 230 Bronze Badge

Hi I have JSON data in the format

I can't find an example in how to extract it ?

All the ones I see have the fieldname before the data not at the end like this

Any help much appreciated

Thanks

Chris

«1

Answers

  • Mike Kutz
    Mike Kutz Member Posts: 6,251 Gold Crown

    That data looks like valid JSON. Please edit your post and replace the image with quoted code that we can copy+paste for testing.

    Also, please add your desired output.

    Thx

  • User_H3J7U
    User_H3J7U Member Posts: 1,062 Gold Trophy

    I can't find an example in how to extract it ?

    Example:

    select treat(this_is_clob as json) as this_is_json
    from (select to_clob('{}') this_is_clob from dual);
    
    THIS_IS_JSON
    ------------
    {}
    

    Read the JSON Developers guide

  • CJ Bell
    CJ Bell Member Posts: 230 Bronze Badge
    edited Oct 20, 2021 4:03PM

    I would like each as a field so I can insert them into columns in a table


    {

    "data": [

    [

    "7764355",

    "23189291",

    "2ab56aaa25d1747f",

    "2021-04-01T01:43:10"

    ],

    [

    "7764355",

    "23189295",

    "2ab56aaa25d1747f",

    "2021-04-01T01:43:22"

    ]

    ],

    "meta": {

    "columns": [

    "session_id",

    "event_id",

    "visitor_id",

    "timestamp"

    ],

    "count": 4

    }

    }

  • mathguy
    mathguy Member Posts: 10,866 Black Diamond

    I can't find an example in how to extract it ?

    All the ones I see have the fieldname before the data not at the end like this


    That's because the JSON structure is non-standard. I assume by "extract" you mean either a SELECT statement or the creation of an actual (stored) table, where the column names are given at the end, and the data is given upfront as an array of arrays. In the data, I assume each nested array corresponds to a row in your "table", with the values (one for each column) given as the nested array elements. Correct?

    This arrangement is unusual, because each "row" represents an "entity" (an "object instance" in object-oriented terminology). Each "column" corresponds to a "property" or "attribute", with a name like session_id and a value like 7764355. In JSON, in most cases, each row is an object, not an array; and the number of attributes, their order and their names are known beforehand. Then "extracting the data" is easy.

    The only reason to have the JSON as you have it, that I can think of, is that the number and the names of columns in your output are dynamic - they are not known until you see the actual JSON string and you find the metadata at the end. Right? Well, you don't see many examples of that because there aren't many. Dynamic data structures aren't very common, especially in a relational database context.

    (By the way, the fact that the "metadata" is given at the end is irrelevant; it could be given first in the string, and it wouldn't make a difference. Same issue.)

    So, please clarify: Is it true that you can't assume you "know" the number of columns and their names beforehand, and that that's part of the JSON you must extract data from? If so, you won't find any simple answers. You will need to use some form of dynamic SQL, or other approaches to extract your data. It's not worth discussing such methods, though, until you confirm that this is indeed the real issue with your JSON.

    As an aside: TIMESTAMP is a reserved keyword in Oracle, so it will definitely not work as a column name. Also, if "metadata" is sent to you from whatever source, it would make more sense if it would also include information about data types: session id and event id seem to be numbers, visitor id seems to be string - or perhaps UUID, which is not a data type in Oracle database, and "timestamp" is indeed timestamp - or perhaps date in Oracle terminology.

  • Aaron L.
    Aaron L. Member Posts: 41 Red Ribbon

    I'm sure there are better ways, but this might get you started: ( I don't know how to make a script block here)

    DECLARE

     V_JSON_CLOB  CLOB := TO_CLOB('{"data": [["7764355","23189291","2ab56aaa25d1747f","2021-04-01T01:43:10"],["7764355","23189295","2ab56aaa25d1747f","2021-04-01T01:43:22"]],"meta": {"columns": ["session_id","event_id","visitor_id","timestamp"],"count": 4}}');

     V_COLUMN_COUNT   PLS_INTEGER        := NULL;

     V_DATA_COUNT    PLS_INTEGER        := NULL;

     V_COLUMN_ARR    DBMS_UTILITY.NAME_ARRAY;

     --

    BEGIN

     APEX_JSON.parse(V_JSON_CLOB);

     V_COLUMN_COUNT := APEX_JSON.get_count(p_path => 'meta.columns');

     DBMS_OUTPUT.PUT_LINE('COLUMNS');

     FOR i IN 1..V_COLUMN_COUNT LOOP

      DBMS_OUTPUT.PUT_LINE(APEX_JSON.get_varchar2(p_path => 'meta.columns[%d]', p0 => i));

      V_COLUMN_ARR(i) := APEX_JSON.get_varchar2(p_path => 'meta.columns[%d]', p0 => i);

     END LOOP;

     V_DATA_COUNT := APEX_JSON.get_count(p_path => 'data');

     FOR i IN 1..V_DATA_COUNT LOOP

      DBMS_OUTPUT.PUT_LINE('DATA SET '||i);

      FOR ii IN 1..V_COLUMN_COUNT LOOP

       DBMS_OUTPUT.PUT_LINE(V_COLUMN_ARR(ii)||': '||APEX_JSON.get_varchar2(p_path => 'data[%d][%d]', p0 => i, p1 => ii));

      END LOOP;

     END LOOP;

    END;

    /

  • CJ Bell
    CJ Bell Member Posts: 230 Bronze Badge

    They are always in that order of the columns in the metadata , the column names are what is supplied when we run the API , I can call the columns in my table anything I want

    I have done JSON before similar to this example


    obviously the issue is that the field names aren't in front of the data

    the format is exactly the same for each record in that session_ID is first , event_id is second etc

  • mathguy
    mathguy Member Posts: 10,866 Black Diamond

    OK - so you know beforehand that the columns will be exactly SESSION_ID, EVENT_ID, VISITOR_ID, TIMESTAMP? If so, why is that even part of the JSON (since you already know that information before you even look at the JSON)?

    If this is true - that you know ahead of time exactly what columns will be in the METADATA portion - then the problem isn't too difficult. Is that what you are saying though - that you know the column names without even looking at the JSON?

  • User_H3J7U
    User_H3J7U Member Posts: 1,062 Gold Trophy
    select nvl(m1,d1) f1, nvl(m2,d2) f2, nvl(m3,d3) f3, nvl(m4,d4) f4, nvl(m5,d5) f5, nvl(m6,d6) f6
    from (select '{"data": [["7764355","23189291","2ab56aaa25d1747f","2021-04-01T01:43:10"],
    ["7764355","23189295","2ab56aaa25d1747f","2021-04-01T01:43:22"]],
    "meta": {"columns": ["session_id","event_id","visitor_id","timestamp"],"count": 4}}' j from dual) t,
    json_table(t.j, '$.*[*]' columns(
      d1 varchar2(20 char) path '$[0]', d2 varchar2(20 char) path '$[1]', d3 varchar2(20 char) path '$[2]',
      d4 varchar2(20 char) path '$[3]', d5 varchar2(20 char) path '$[4]', d6 varchar2(20 char) path '$[5]',
      m1 varchar2(20 char) path '$.columns[0]', m2 varchar2(20 char) path '$.columns[1]', m3 varchar2(20 char) path '$.columns[2]',
      m4 varchar2(20 char) path '$.columns[3]', m5 varchar2(20 char) path '$.columns[4]', m6 varchar2(20 char) path '$.columns[5]'
    )) jt
    order by nvl2(m1,1,2)
    ;
    
    F1                   F2                   F3                   F4                   F5                   F6                  
    -------------------- -------------------- -------------------- -------------------- -------------------- --------------------
    session_id           event_id             visitor_id           timestamp                                                     
    7764355              23189291             2ab56aaa25d1747f     2021-04-01T01:43:10                                           
    7764355              23189295             2ab56aaa25d1747f     2021-04-01T01:43:22                                           
    

    @mathguy As an aside: TIMESTAMP is a reserved keyword in Oracle, so it will definitely not work as a column name.

    select reserved, res_semi from v$reserved_words where keyword='TIMESTAMP';
    
    RESERVED  RES_SEMI 
    ---------- ----------
    N         N        
    


    CJ Bell
  • mathguy
    mathguy Member Posts: 10,866 Black Diamond

    Here's one way you can extract the data, if you know the columns in advance. Note that I assume you also know the required data types in advance (rather than just the column names and their order).


    with
      inputs(i_clob) as (
        select
    '{
    "data": [
    [
    "7764355",
    "23189291",
    "2ab56aaa25d1747f",
    "2021-04-01T01:43:10"
    ],
    [
    "7764355",
    "23189295",
    "2ab56aaa25d1747f",
    "2021-04-01T01:43:22"
    ]
    ],
    "meta": {
    "columns": [
    "session_id",
    "event_id",
    "visitor_id",
    "timestamp"
    ],
    "count": 4
    }
    }'
        from dual
      )
    , prep (row_num, col_num, val) as (
        select row_num, col_num, val
        from   inputs cross apply
               json_table(i_clob, '$."data"[*]'
                    columns (
                        row_num for ordinality,
                        nested path '$[*]'
                            columns (
                                col_num for ordinality,
                                val path '$'
                            )
                    )
               )
      )
    select to_number(s) as session_id, to_number(e) as event_id, v as visitor_id,
           to_date(t, 'yyyy-mm-dd"T"hh24:mi:ss') as timestamp_
    from   prep
    pivot  (max(val) for col_num in (1 as s, 2 as e, 3 as v, 4 as t))
    ;
    
    SESSION_ID   EVENT_ID VISITOR_ID       TIMESTAMP_         
    ---------- ---------- ---------------- -------------------
       7764355   23189291 2ab56aaa25d1747f 2021-04-01 01:43:10
       7764355   23189295 2ab56aaa25d1747f 2021-04-01 01:43:22
    
    CJ Bell
  • mathguy
    mathguy Member Posts: 10,866 Black Diamond

    @User_H3J7U

    OK - I just tested and I was able to create a table with a column named TIMESTAMP. So your point is 100% correct.

    That is definitely a bug in Oracle db, whether Oracle agrees or not. Note that DATE is a reserved keyword and flagged as such if you try to use it as column name. Both DATE and TIMESTAMP should be reserved keywords, and for the same reason: they are required, verbatim keywords for date (resp. timestamp) literals.