Forum Stats

  • 3,825,200 Users
  • 2,260,480 Discussions
  • 7,896,437 Comments

Discussions

building XPath with the XML having the Namespace using PL SQL

Sandeep Sangameshwara-Oracle
Sandeep Sangameshwara-Oracle Member Posts: 10
edited Nov 11, 2013 10:58AM in SQL & PL/SQL

While trying to build the path of each node, when the XML has no namespace in it , the below code works fine providing the result

1~/

2~/person/

3~/person/age/

4~/person/homecity/

5~/person/name/

6~/person/homecity/lat/

7~/person/homecity/name/

8~/person/homecity/long/

But when the xml is changed to

<person xmlns="urn:person" xmlns:lat="urn:lat">
<name>Rob</name>
<age>37</age>
<homecity>
    <name>London</name>
    <lat>123.000</lat>
    <long>0.00</long>
</homecity>
</person>"

The result on executing the below code is resulting into only into below result

1~/
 2~/person/

In the XML provided above, XML name space is not constant and it may vary for every XML. My requirement is to parse the complete XML, where i can read the XML with namespace and get the result as mentioned below

1~/
2~/person/
3~/person/age/
4~/person/homecity/
5~/person/name/
6~/person/homecity/lat:lat/
7~/person/homecity/name/
8~/person/homecity/long/

Can you please help me resolving the issue mentioned. Thanks in advance. --Code Snippet below :

DECLARE
  l_File VARCHAR2(32000) := '<person>
<name>Rob</name>
<age>37</age>
<homecity>
    <name>London</name>
    <lat>123.000</lat>
    <long>0.00</long>
</homecity>
</person>';
 
 l_Where_Clause VARCHAR2(100) := '/*';
 l_Append_Var   VARCHAR2(100) := '/';
 
 TYPE Ty_Paths IS TABLE OF VARCHAR2(1000) INDEX BY PLS_INTEGER;
 l_Ty_Paths      Ty_Paths;
 l_Ty_Paths_Temp Ty_Paths;
 
 TYPE Ty_Verifier IS TABLE OF VARCHAR2(1000) INDEX BY VARCHAR2(1000);
 l_Ty_Varifier Ty_Verifier;
 
l_Prev_Query_Rec VARCHAR2(100);
l_Index_Num      NUMBER := 0;
l_Cur_Exec_Row   NUMBER := 0;
BEGIN
 l_Ty_Paths(Nvl(l_Ty_Paths.COUNT, 0) + 1) := l_Append_Var;
 l_Cur_Exec_Row := 1;
 
 --Dbms_Output.put_line('Before entering the loop');
 
 LOOP
   l_Ty_Paths_Temp.DELETE;
  SELECT DISTINCT REPLACE(l_Append_Var || '/' || t.Xml || '/', '//', '/') BULK COLLECT
   INTO   l_Ty_Paths_Temp
  FROM   (SELECT Xmltype(Extract(VALUE(e), '/').Getstringval()) .Getrootelement() AS Xml
           FROM   TABLE(Xmlsequence(Extract(Xmltype(l_File), l_Where_Clause))) e) t;
 
  l_Ty_Varifier(Nvl(l_Ty_Varifier.COUNT, 0) + 1) := l_Append_Var;
  --Dbms_Output.put_line('L_TY_PATHS_TEMP.Count::'||L_TY_PATHS_TEMP.Count);
  IF l_Ty_Paths_Temp.COUNT > 0 THEN
     l_Index_Num := Nvl(l_Ty_Paths.COUNT, 0) + 1;
     FOR i IN l_Ty_Paths_Temp.FIRST .. l_Ty_Paths_Temp.LAST LOOP
        l_Ty_Paths(l_Index_Num) := l_Ty_Paths_Temp(i);
        --Dbms_Output.put_line('L_INDEX_NUM::'||L_INDEX_NUM);
        --Dbms_Output.put_line('L_TY_PATHS(L_INDEX_NUM)::'||L_TY_PATHS(L_INDEX_NUM));
        l_Index_Num := l_Index_Num + 1;
     END LOOP;
  END IF;
  --Dbms_Output.put_line('L_TY_PATHS.Count::'||L_TY_PATHS.Count);
  --Dbms_Output.put_line('L_TY_PATHS.Count::'||L_CUR_EXEC_ROW);
 
  IF (NOT l_Ty_Paths.EXISTS(l_Cur_Exec_Row + 1)) OR (l_Cur_Exec_Row = l_Ty_Paths.COUNT) THEN
     --Dbms_Output.put_line('Exiting');
     EXIT;
  ELSE
     --Dbms_Output.put_line('Inside the Else part');
 
     l_Cur_Exec_Row := l_Cur_Exec_Row + 1;
     l_Append_Var   := l_Ty_Paths(l_Cur_Exec_Row);
     l_Where_Clause := l_Ty_Paths(l_Cur_Exec_Row) || '*';
  END IF;
 
  --To Display the record:
     --Dbms_Output.put_line(L_TY_PATHS.Count);
  END LOOP;
  IF l_Ty_Paths.COUNT > 0 THEN
    FOR i IN l_Ty_Paths.FIRST .. l_Ty_Paths.LAST LOOP
      Dbms_Output.Put_Line(i || ' record is ' || l_Ty_Paths(i));
   END LOOP;
 END IF;
 
END;
BluShadow

Best Answer

  • odie_63
    odie_63 Member Posts: 8,491 Silver Trophy
    edited Nov 11, 2013 11:18AM Answer ✓

    Thanks.

    If you have the schemas, then it may be easier to work on them directly.

    Here's the idea :

    1) Install "Oracle XML DB Manageability Packages", available on XML DB sample code page : http://download.oracle.com/otn/samplecode/xdb_util.zip

    It's a set of utilities that help us annotate XML schemas, deal with the underlying OR storage structure, and (this is the interesting part here) expose a few dictionary views to describe XSD structure relationally.

    (it's now a 12c built-in btw)

    2) Register the two schemas :

    begin
      dbms_xmlschema.registerSchema(
        schemaURL       => 'pacs.002.001.03S2.xsd'
      , schemaDoc       => xmltype(bfilename('TEST_DIR','pacs.002.001.03S2.xsd'), nls_charset_id('AL32UTF8'))
      , local           => true
      , genTypes        => false
      , genTables       => false
      , enableHierarchy => dbms_xmlschema.ENABLE_HIERARCHY_NONE
      , options         => dbms_xmlschema.REGISTER_BINARYXML
      );
    end;
    /
    
    begin
      dbms_xmlschema.registerSchema(
        schemaURL       => 'SCTCvfBlkCredTrf.xsd'
      , schemaDoc       => xmltype(bfilename('TEST_DIR','SCTCvfBlkCredTrf.xsd'), nls_charset_id('AL32UTF8'))
      , local           => true
      , genTypes        => false
      , genTables       => false
      , enableHierarchy => dbms_xmlschema.ENABLE_HIERARCHY_NONE
      , options         => dbms_xmlschema.REGISTER_BINARYXML
      );
    end;
    /
    
    
    

    3) The following query builds all node paths defined by the main schema (SCTCvfBlkCredTrf.xsd), along with the namespaces :

    with schema_list (schema_url, schema_owner) as (
        select 'SCTCvfBlkCredTrf.xsd', 'OTN' from dual
        union all
        select sd.dep_schema_url, sd.dep_schema_owner
        from schema_list sl
             join dba_xml_schema_dependency sd on sd.schema_url = sl.schema_url
                                              and sd.schema_owner = sl.schema_owner
    ),
    namespace_mapping (target_namespace, prefix) as (
        select target_namespace
             , 'ns' || row_number() over(order by target_namespace) as prefix
        from user_xml_schema_namespaces
        where schema_url in ( select schema_url from schema_list )
    ),
    schema_nodes (
        is_attr
    ,   node_name
    ,   max_occurs
    ,   element_id
    ,   parent_element_id
    ,   target_namespace
    ,   schema_url
    ) as ( 
        select 0
             , element_name as node_name
             , xmlcast(
                 xmlquery('/xs:element/@maxOccurs' passing element returning content) 
                 as varchar2(10)
               )
             , element_id
             , parent_element_id
             , target_namespace
             , schema_url
        from user_xml_schema_elements
        where schema_url in ( select schema_url from schema_list )
        union all
        select 1
             , attribute_name as node_name
             , null
             , null
             , element_id
             , target_namespace
             , schema_url
        from user_xml_schema_attributes
        where schema_url in ( select schema_url from schema_list )
    )
    select level
         , sn.node_name
         , sn.max_occurs
         , connect_by_isleaf as is_leaf
         , sys_connect_by_path(
             case when ns.prefix is not null and is_attr = 0 then ns.prefix || ':' end ||
             case when sn.is_attr = 1 then '@' end ||          
             sn.node_name
           , '/'
           ) as xpath
         --, target_namespace
    from schema_nodes sn
         left outer join namespace_mapping ns on ns.target_namespace = sn.target_namespace
    connect by prior sn.element_id = sn.parent_element_id
    start with sn.schema_url = 'SCTCvfBlkCredTrf.xsd'
           and sn.parent_element_id is null
    ;
    
    LEVEL NODE_NAME          MAX_OCCURS IS_LEAF XPATH
    ----- ------------------ ---------- ------- ----------------------------------------------------------------------------------------------------------
        1 SCTCvfBlkCredTrf                    0 /ns2:SCTCvfBlkCredTrf
        2 SndgInst                            1 /ns2:SCTCvfBlkCredTrf/ns2:SndgInst
        2 RcvgInst                            1 /ns2:SCTCvfBlkCredTrf/ns2:RcvgInst
        2 SrvcId                              1 /ns2:SCTCvfBlkCredTrf/ns2:SrvcId
        2 TstCode                             1 /ns2:SCTCvfBlkCredTrf/ns2:TstCode
        2 FType                               1 /ns2:SCTCvfBlkCredTrf/ns2:FType
        2 FileRef                             1 /ns2:SCTCvfBlkCredTrf/ns2:FileRef
        2 FileDtTm                            1 /ns2:SCTCvfBlkCredTrf/ns2:FileDtTm
        2 OrigFRef                            1 /ns2:SCTCvfBlkCredTrf/ns2:OrigFRef
        2 OrigFName                           1 /ns2:SCTCvfBlkCredTrf/ns2:OrigFName
        2 OrigDtTm                            1 /ns2:SCTCvfBlkCredTrf/ns2:OrigDtTm
        2 FileRjctRsn                         1 /ns2:SCTCvfBlkCredTrf/ns2:FileRjctRsn
        2 FileBusDt                           1 /ns2:SCTCvfBlkCredTrf/ns2:FileBusDt
        2 FileCycleNo                         1 /ns2:SCTCvfBlkCredTrf/ns2:FileCycleNo
        2 FIToFIPmtStsRptS2  1                0 /ns2:SCTCvfBlkCredTrf/ns2:FIToFIPmtStsRptS2
        3 GrpHdr                              0 /ns2:SCTCvfBlkCredTrf/ns2:FIToFIPmtStsRptS2/ns1:GrpHdr
        4 MsgId                               1 /ns2:SCTCvfBlkCredTrf/ns2:FIToFIPmtStsRptS2/ns1:GrpHdr/ns1:MsgId
        4 CreDtTm                             1 /ns2:SCTCvfBlkCredTrf/ns2:FIToFIPmtStsRptS2/ns1:GrpHdr/ns1:CreDtTm
        4 InstgAgt                            0 /ns2:SCTCvfBlkCredTrf/ns2:FIToFIPmtStsRptS2/ns1:GrpHdr/ns1:InstgAgt
        5 FinInstnId                          0 /ns2:SCTCvfBlkCredTrf/ns2:FIToFIPmtStsRptS2/ns1:GrpHdr/ns1:InstgAgt/ns1:FinInstnId
        6 BIC                                 1 /ns2:SCTCvfBlkCredTrf/ns2:FIToFIPmtStsRptS2/ns1:GrpHdr/ns1:InstgAgt/ns1:FinInstnId/ns1:BIC
        3 OrgnlGrpInfAndSts                   0 /ns2:SCTCvfBlkCredTrf/ns2:FIToFIPmtStsRptS2/ns1:OrgnlGrpInfAndSts
        4 OrgnlMsgId                          1 /ns2:SCTCvfBlkCredTrf/ns2:FIToFIPmtStsRptS2/ns1:OrgnlGrpInfAndSts/ns1:OrgnlMsgId
        4 OrgnlMsgNmId                        1 /ns2:SCTCvfBlkCredTrf/ns2:FIToFIPmtStsRptS2/ns1:OrgnlGrpInfAndSts/ns1:OrgnlMsgNmId
        4 OrgnlNbOfTxs                        1 /ns2:SCTCvfBlkCredTrf/ns2:FIToFIPmtStsRptS2/ns1:OrgnlGrpInfAndSts/ns1:OrgnlNbOfTxs
        4 OrgnlCtrlSum                        1 /ns2:SCTCvfBlkCredTrf/ns2:FIToFIPmtStsRptS2/ns1:OrgnlGrpInfAndSts/ns1:OrgnlCtrlSum
        4 GrpSts                              1 /ns2:SCTCvfBlkCredTrf/ns2:FIToFIPmtStsRptS2/ns1:OrgnlGrpInfAndSts/ns1:GrpSts
        4 StsRsnInf                           0 /ns2:SCTCvfBlkCredTrf/ns2:FIToFIPmtStsRptS2/ns1:OrgnlGrpInfAndSts/ns1:StsRsnInf
        5 Orgtr                               0 /ns2:SCTCvfBlkCredTrf/ns2:FIToFIPmtStsRptS2/ns1:OrgnlGrpInfAndSts/ns1:StsRsnInf/ns1:Orgtr
        6 Id                                  0 /ns2:SCTCvfBlkCredTrf/ns2:FIToFIPmtStsRptS2/ns1:OrgnlGrpInfAndSts/ns1:StsRsnInf/ns1:Orgtr/ns1:Id
        7 OrgId                               0 /ns2:SCTCvfBlkCredTrf/ns2:FIToFIPmtStsRptS2/ns1:OrgnlGrpInfAndSts/ns1:StsRsnInf/ns1:Orgtr/ns1:Id/ns1:OrgId
        8 BICOrBEI                            1 /ns2:SCTCvfBlkCredTrf/ns2:FIToFIPmtStsRptS2/ns1:OrgnlGrpInfAndSts/ns1:StsRsnInf/ns1:Orgtr/ns1:Id/ns1:OrgId/ns1:BICOrBEI
        5 Rsn                                 0 /ns2:SCTCvfBlkCredTrf/ns2:FIToFIPmtStsRptS2/ns1:OrgnlGrpInfAndSts/ns1:StsRsnInf/ns1:Rsn
        6 Cd                                  1 /ns2:SCTCvfBlkCredTrf/ns2:FIToFIPmtStsRptS2/ns1:OrgnlGrpInfAndSts/ns1:StsRsnInf/ns1:Rsn/ns1:Cd
        6 Prtry                               1 /ns2:SCTCvfBlkCredTrf/ns2:FIToFIPmtStsRptS2/ns1:OrgnlGrpInfAndSts/ns1:StsRsnInf/ns1:Rsn/ns1:Prtry
        4 NbOfTxsPerSts      2                0 /ns2:SCTCvfBlkCredTrf/ns2:FIToFIPmtStsRptS2/ns1:OrgnlGrpInfAndSts/ns1:NbOfTxsPerSts
        5 DtldNbOfTxs                         1 /ns2:SCTCvfBlkCredTrf/ns2:FIToFIPmtStsRptS2/ns1:OrgnlGrpInfAndSts/ns1:NbOfTxsPerSts/ns1:DtldNbOfTxs
        5 DtldSts                             1 /ns2:SCTCvfBlkCredTrf/ns2:FIToFIPmtStsRptS2/ns1:OrgnlGrpInfAndSts/ns1:NbOfTxsPerSts/ns1:DtldSts
        5 DtldCtrlSum                         1 /ns2:SCTCvfBlkCredTrf/ns2:FIToFIPmtStsRptS2/ns1:OrgnlGrpInfAndSts/ns1:NbOfTxsPerSts/ns1:DtldCtrlSum
        3 TxInfAndSts        unbounded        0 /ns2:SCTCvfBlkCredTrf/ns2:FIToFIPmtStsRptS2/ns1:TxInfAndSts
        4 StsId                               1 /ns2:SCTCvfBlkCredTrf/ns2:FIToFIPmtStsRptS2/ns1:TxInfAndSts/ns1:StsId
        4 OrgnlInstrId                        1 /ns2:SCTCvfBlkCredTrf/ns2:FIToFIPmtStsRptS2/ns1:TxInfAndSts/ns1:OrgnlInstrId
        4 OrgnlEndToEndId                     1 /ns2:SCTCvfBlkCredTrf/ns2:FIToFIPmtStsRptS2/ns1:TxInfAndSts/ns1:OrgnlEndToEndId
        4 OrgnlTxId                           1 /ns2:SCTCvfBlkCredTrf/ns2:FIToFIPmtStsRptS2/ns1:TxInfAndSts/ns1:OrgnlTxId
        4 TxSts                               1 /ns2:SCTCvfBlkCredTrf/ns2:FIToFIPmtStsRptS2/ns1:TxInfAndSts/ns1:TxSts
        4 StsRsnInf                           0 /ns2:SCTCvfBlkCredTrf/ns2:FIToFIPmtStsRptS2/ns1:TxInfAndSts/ns1:StsRsnInf
        5 Orgtr                               0 /ns2:SCTCvfBlkCredTrf/ns2:FIToFIPmtStsRptS2/ns1:TxInfAndSts/ns1:StsRsnInf/ns1:Orgtr
        6 Id                                  0 /ns2:SCTCvfBlkCredTrf/ns2:FIToFIPmtStsRptS2/ns1:TxInfAndSts/ns1:StsRsnInf/ns1:Orgtr/ns1:Id
        7 OrgId                               0 /ns2:SCTCvfBlkCredTrf/ns2:FIToFIPmtStsRptS2/ns1:TxInfAndSts/ns1:StsRsnInf/ns1:Orgtr/ns1:Id/ns1:OrgId
        8 BICOrBEI                            1 /ns2:SCTCvfBlkCredTrf/ns2:FIToFIPmtStsRptS2/ns1:TxInfAndSts/ns1:StsRsnInf/ns1:Orgtr/ns1:Id/ns1:OrgId/ns1:BICOrBEI
        5 Rsn                                 0 /ns2:SCTCvfBlkCredTrf/ns2:FIToFIPmtStsRptS2/ns1:TxInfAndSts/ns1:StsRsnInf/ns1:Rsn
        6 Cd                                  1 /ns2:SCTCvfBlkCredTrf/ns2:FIToFIPmtStsRptS2/ns1:TxInfAndSts/ns1:StsRsnInf/ns1:Rsn/ns1:Cd
        6 Prtry                               1 /ns2:SCTCvfBlkCredTrf/ns2:FIToFIPmtStsRptS2/ns1:TxInfAndSts/ns1:StsRsnInf/ns1:Rsn/ns1:Prtry
        4 InstdAgt                            0 /ns2:SCTCvfBlkCredTrf/ns2:FIToFIPmtStsRptS2/ns1:TxInfAndSts/ns1:InstdAgt
        5 FinInstnId                          0 /ns2:SCTCvfBlkCredTrf/ns2:FIToFIPmtStsRptS2/ns1:TxInfAndSts/ns1:InstdAgt/ns1:FinInstnId
        6 BIC                                 1 /ns2:SCTCvfBlkCredTrf/ns2:FIToFIPmtStsRptS2/ns1:TxInfAndSts/ns1:InstdAgt/ns1:FinInstnId/ns1:BIC
        4 OrgnlTxRef                          0 /ns2:SCTCvfBlkCredTrf/ns2:FIToFIPmtStsRptS2/ns1:TxInfAndSts/ns1:OrgnlTxRef
        5 IntrBkSttlmAmt                      0 /ns2:SCTCvfBlkCredTrf/ns2:FIToFIPmtStsRptS2/ns1:TxInfAndSts/ns1:OrgnlTxRef/ns1:IntrBkSttlmAmt
        6 Ccy                                 1 /ns2:SCTCvfBlkCredTrf/ns2:FIToFIPmtStsRptS2/ns1:TxInfAndSts/ns1:OrgnlTxRef/ns1:IntrBkSttlmAmt/@Ccy
        5 IntrBkSttlmDt                       1 /ns2:SCTCvfBlkCredTrf/ns2:FIToFIPmtStsRptS2/ns1:TxInfAndSts/ns1:OrgnlTxRef/ns1:IntrBkSttlmDt
        5 DbtrAgt                             0 /ns2:SCTCvfBlkCredTrf/ns2:FIToFIPmtStsRptS2/ns1:TxInfAndSts/ns1:OrgnlTxRef/ns1:DbtrAgt
        6 FinInstnId                          0 /ns2:SCTCvfBlkCredTrf/ns2:FIToFIPmtStsRptS2/ns1:TxInfAndSts/ns1:OrgnlTxRef/ns1:DbtrAgt/ns1:FinInstnId
        7 BIC                                 1 /ns2:SCTCvfBlkCredTrf/ns2:FIToFIPmtStsRptS2/ns1:TxInfAndSts/ns1:OrgnlTxRef/ns1:DbtrAgt/ns1:FinInstnId/ns1:BIC
        5 CdtrAgt                             0 /ns2:SCTCvfBlkCredTrf/ns2:FIToFIPmtStsRptS2/ns1:TxInfAndSts/ns1:OrgnlTxRef/ns1:CdtrAgt
        6 FinInstnId                          0 /ns2:SCTCvfBlkCredTrf/ns2:FIToFIPmtStsRptS2/ns1:TxInfAndSts/ns1:OrgnlTxRef/ns1:CdtrAgt/ns1:FinInstnId
        7 BIC                                 1 /ns2:SCTCvfBlkCredTrf/ns2:FIToFIPmtStsRptS2/ns1:TxInfAndSts/ns1:OrgnlTxRef/ns1:CdtrAgt/ns1:FinInstnId/ns1:BIC

    Actually, you would typically extract the namespace mapping data in a separate query so that you can use it to build the dynamic XMLNamespaces clause.

    NB1 : Also note that not all paths are needed, only those whose IS_LEAF = 1 or MAX_OCCURS = 'unbounded' or MAX_OCCURS > '1', so you may want to add a filter predicate in the query directly.

    NB2 : The query doesn't handle every feature of XSD, for example I've relied on the default behaviour for the 'element_form_default' and 'attribute_form_default' attributes.

    BluShadow
«1

Answers

  • odie_63
    odie_63 Member Posts: 8,491 Silver Trophy
    edited Nov 1, 2013 4:06PM

    Hi,

    1) What's your database version ?

    2) What's the practical use of extracting all the paths like this, what are you trying to achieve, besides a purely academic exercise ?

    EXTRACT function possesses a third parameter to handle namespace mappings provided you know them at runtime, but apparently you want to extract namespace information as well, right? Or do you just need node names regardless of the namespace they belong to ?

    Your expected output is not consistent with the input XML document :

    6~/person/homecity/lat:lat/

    but the lat element is not namespace-qualified in the input doc.

    Could you clarify this point?

    Meanwhile, you may find this a little more simpler to get the same result :

    SQL> var doc clob
    SQL>
    SQL> begin
      2   :doc := '<person xmlns="urn:person" xmlns:lat="urn:lat">
      3  <name>Rob</name>
      4  <age>37</age>
      5  <homecity>
      6      <name x="1">London</name>
      7      <lat:lat>123.000</lat:lat>
      8      <long>0.00</long>
      9  </homecity>
     10  </person>';
     11  end;
     12  /
    
    PL/SQL procedure successfully completed.
    
    SQL> select x.*
      2  from xmltable(
      3       'declare function local:getXPath($e as node(), $xp as xs:string) as xs:string*
      4        {
      5          for $i in $e/(*|@*)
      6          let $xp2 := concat(
      7                        $xp
      8                      , "/"
      9                      , if ($i instance of attribute()) then "@ else ()
     10                      , name($i)
     11                      )
     12          return ($xp2, local:getXPath($i, $xp2))
     13        }; (: :)
     14        /", local:getXPath($d, "")'
     15        passing xmlparse(document :doc) as "d"
     16        columns xpath varchar2(4000) path '.'
     17      ) x
     18  ;
    
    XPATH
    --------------------------------------------------------------------------------
    /
    /person
    /person/name
    /person/age
    /person/homecity
    /person/homecity/name
    /person/homecity/name/@x
    /person/homecity/lat:lat
    /person/homecity/long
    
    9 rows selected.
    
    

    NB1 : I've added support for attributes

    NB2 : add the DISTINCT keyword as necessary

  • Hi,

    Thanks for the reply.

    Please find the details :

    1) What's your database version ?

    Database version  :

    Oracle Database 11g Enterprise Edition Release 11.2.0.2.0 - 64bit Production

    2) What's the practical use of extracting all the paths like this, what are you trying to achieve, besides a purely academic exercise ?

    The XML provided was for sample XML.

    The practical use being ,I wanted to parse the SEPA messages using the below code snippet dynamically.

    create table data_table (

         data xmltype

       )

        xmltype column data store as securefile binary xml

        ;

    insert into data_table values (

        xmltype('<?xml version="1.0" encoding="UTF-8"?>

    <SCTScfBlkCredTrf xmlns="urn:S2SCTScf:xsd:$SCTScfBlkCredTrf">

      <SndgInst>CMCIFRPPXXX</SndgInst>

      <RcvgInst>RLBBAT2E083</RcvgInst>

      <FileRef>006FRID2516PH712</FileRef>

      <SrvcId>SCT</SrvcId>

      <TstCode>T</TstCode>

      <FType>SCF</FType>

      <FDtTm>2012-11-01T01:12:22</FDtTm>

      <NumCTBlk>1</NumCTBlk>

      <NumPCRBlk>0</NumPCRBlk>

      <NumRFRBlk>0</NumRFRBlk>

      <NumROIBlk>0</NumROIBlk>

    <FIToFICstmrCdtTrf xmlns="urn:iso:std:iso:20022:tech:xsd:sct:pacs.008.001.02">

      <GrpHdr>

        <MsgId>006MSID12511PH712</MsgId>

        <CreDtTm>2012-11-01T08:09:14</CreDtTm>

        <NbOfTxs>1</NbOfTxs>

        <TtlIntrBkSttlmAmt Ccy="EUR">20000</TtlIntrBkSttlmAmt>

        <IntrBkSttlmDt>2012-11-01</IntrBkSttlmDt>

        <SttlmInf>

          <SttlmMtd>INGA</SttlmMtd>

          <ClrSys>

            <Prtry></Prtry>

          </ClrSys>

        </SttlmInf>

        <InstgAgt>

          <FinInstnId>

            <BIC>RLBBAT2E083</BIC>

          </FinInstnId>

        </InstgAgt>

        <InstdAgt>

          <FinInstnId>

            <BIC>HELADEF1XXX</BIC>

          </FinInstnId>

        </InstdAgt>

      </GrpHdr>

      <CdtTrfTxInf>

        <PmtId>

          <EndToEndId>006SEOP23END1712</EndToEndId>

          <TxId>006SEOP231PH1712</TxId>

        </PmtId>

        <PmtTpInf>

          <SvcLvl>

            <Cd>SEPA</Cd>

          </SvcLvl>

        </PmtTpInf>

        <IntrBkSttlmAmt Ccy="EUR">20000</IntrBkSttlmAmt>

        <AccptncDtTm>2012-11-01T12:00:00</AccptncDtTm>

        <ChrgBr>SLEV</ChrgBr>

        <Dbtr>

          <Nm>Mrinmoy Sahu</Nm>

          <PstlAdr>

            <Ctry>FR</Ctry>     

            <AdrLine>6</AdrLine>

            <AdrLine>Bangalore</AdrLine>

          </PstlAdr>

          <Id>

            <PrvtId>

              <Othr>

                <Id>E20809</Id>

                <SchmeNm></SchmeNm>

                <Issr>ORACLE CORP</Issr>

              </Othr>

            </PrvtId>

          </Id>

        </Dbtr>

        <DbtrAcct>

          <Id>

            <IBAN>FR7030087330086000000000591</IBAN>

          </Id>

        </DbtrAcct>

        <DbtrAgt>

          <FinInstnId>

            <BIC>CMCIFRPPXXX</BIC>

          </FinInstnId>

        </DbtrAgt>

        <CdtrAgt>

          <FinInstnId>

            <BIC>RLBBAT2E083</BIC>

          </FinInstnId>

        </CdtrAgt>

        <Cdtr>

          <Nm>Mrinmoy Sahu</Nm>

          <PstlAdr>

            <Ctry>FR</Ctry>     

            <AdrLine>22 </AdrLine>

          </PstlAdr>

          <Id>

            <PrvtId>

              <Othr>

                <Id>F676869</Id>

                <SchmeNm></SchmeNm>

                <Issr>GOVT OF INDIA</Issr>

              </Othr>

            </PrvtId>

          </Id>

        </Cdtr>

        <CdtrAcct>

          <Id>

            <IBAN>FR7630087330086000000004266</IBAN>

          </Id>

        </CdtrAcct>

        <RmtInf>

            <Ustrd>abc</Ustrd>

          </RmtInf>

      </CdtTrfTxInf>   

      </FIToFICstmrCdtTrf>

    </SCTScfBlkCredTrf>

    ')

       );

    alter session set nls_numeric_characters = ".,";

          SELECT Msgid, Msgid1, Sttlmmtd

       FROM   data_table t,Xmltable(Xmlnamespaces('urn:S2SCTScf:xsd:$SCTScfBlkCredTrf' AS "A", 'urn:iso:std:iso:20022:tech:xsd:sct:pacs.008.001.02' AS "B")

                        ,'/A:SCTScfBlkCredTrf/B:FIToFICstmrCdtTrf' Passing t.data Columns

                        Msgid Path 'A:SndgInst'

                        ,Msgid1 Path 'B:GrpHdr/B:MsgId'

                        ,Sttlmmtd Path 'B:GrpHdr/B:SttlmInf/B:SttlmMtd');

                       

                       

    MSGID             ;MSGID1           ;STTLMMTD         ;

    -----------------;-----------------;-----------------;

                     ;006MSID12511PH712;INGA             ;

    The idea was to :

    1). Map the Column Names for the XPath built using the previous code.

    2). Build the Select statement shown above dynamically based on the Xpath built in the previous code using a PLSQL block, and to process the data by doing a bulk collect on the XML.

    The above XML may contain multiple <FIToFICstmrCdtTrf></FIToFICstmrCdtTrf> nodes.

    Since the name space in the XML varies for each version of the SEPA messages , i need to get the names spaces ALSO to be picked up dynamically and parse the XML data based on the select statement as shown above.

    Could you please guide me with

    1). Is the approach taken appropriate?

    2). Any alternative approach should be looked for ?

    3). How to extract the name spaces available in the XML?

    Thanks.

  • odie_63
    odie_63 Member Posts: 8,491 Silver Trophy

    Are namespaces the only moving parts ? or are you also expecting the paths to query to change as well ?

  • Whenever "ISO 20022, a multi part International Standard prepared by ISO Technical Committee" , do some changes in the higher version of the Messages, The Name Space will get changed from "xmlns="urn:iso:std:iso:20022:tech:xsd:sct:pacs.008.001.02" to "xmlns="urn:iso:std:iso:20022:tech:xsd:sct:pacs.008.001.03" for example and correspondingly some new tags may get added.

    Until the ISO 20022 doesn't do any changes to the messages, we will be using the latest version as our base.

    But when there is a change in the XML message, we should be able to handle the changes by rebuilding the Complete path again and reassigning the Column names for the new Tags added.

  • odie_63
    odie_63 Member Posts: 8,491 Silver Trophy
    edited Nov 6, 2013 1:07PM
    But when there is a change in the XML message, we should be able to handle the changes by rebuilding the Complete path again and reassigning the Column names for the new Tags added.
    
    

    OK, then how will you know which elements to keep? How will you know their meanings?

    More importantly, if you're extracting XML data as relational columns that means you have a hardcoded holding structure defined somewhere, and if new elements are coming, you have to evolve that structure.

    It's not yet very clear to me how you're processing the data, or how you're expecting to handle structural change without evolving the program, but this more and more sounds like a job for the DOM API, not relational tool like XMLTable.

  • Hi,

    Let me explain the Complete thought process:

    Goal : To parse the XML and insert the record into the Table.

    To achieve the above goal, we have split the activity into two step process.

    Step 1 : Parse the XML to build the XPath, and provide the built XPath in an UI, where the Corresponding Column of a Table for a built XPath will be hardcoded.

    The above activity is a one time activity.

    Step 2 : Based on the provided Data, we will build the select query and extract the data when ever there is a Upload of an XML done.

    Step 1 activity will be repeated when ever the ISO 20022 organization publish with the latest version, in which case the Name space and the tags may get changed/added.

    In this scenario we will have a check of the existing XPath, and ignore those paths and show only the paths which are not available in the database which was built in the previous Step 1 for the older version in the same UI.

    Then map the new XPath to the new Columns as required and save the records.

    Then the Step 2 will remain the same as we will build the SQL select statement to select the record dynamically.

    Thanks.

  • odie_63
    odie_63 Member Posts: 8,491 Silver Trophy

    Thanks, that clears things up a little bit.

    How does the UI handle the mapping for repeating nodes ? At some point, it has to declare which nodes represent logical rows, and which are column values.

    I confirm this is more like a job for the DOM API, more precisely DBMS_XSLPROCESSOR that possesses wrapper procedures and functions to easily walk sequences of repeating nodes and eventually extract leaf values based on their XPaths.

  • BluShadow
    BluShadow Member, Moderator Posts: 41,978 Red Diamond

    Sounds similar to an activity we had to build an ETL (or more like ETLE) product based on a supplied XML Schema.  However we have no control of the XML Schema and it may change in the future (in the early days of the project it was changing quite frequently, but now it has stabilized for the time being).

    In that case I wrote an package that read the XML schema as an XML DOM Document (picking it up from the XML DB WebDav area).  This then parsed the XML Schema, including all the included schemas (yes they didn't make it easy for us) by recursing down all the DOM nodes in the document and storing that information in a standard relational table.  That information is then post-processed to 'flatten' out the schema in terms of the simple types etc. so that we ended up with metadata from the schema that a second package uses to automatically generate the staging tables for our ETL product, as well as a package that can query the data from the staging tables and generate XML in the same structure as defined by the schema. On top of that it generated a skeleton package for mapping data into the staging tables, so all we are then left with to manually code is the completion of the skeleton package to obtain the source data and put it in the right staging tables (the package handles all the primary and secondary keys so it really is a case of obtain the data and mapping it to the right place).

    When a new XML schema is supplied we can perform the same exercise in another schema to generate the staging tables etc. and then do a database comparison on those staging tables with the old ones to get information about what is new, what has changed, and what has been removed, so that we can then evaluate how much work is required to re-do the mapping.  So, within 1 day of receiving a new schema, we can create a report of the changes and give our management a rough estimate of how much work will be involved, as well as highlight any potential issues.

    When you're being supplied with an unknown structure, you cannot easily write stuff using standard XML functionality and just expect it to be able to handle that new structure.  In our case the writing of the XML Schema parser using the DBMS_XMLDOM package, was key to the success of our project, ensuring things weren't missed and that the XML produced as output at the end was pretty much guaranteed to be of the right structure.  Other suppliers with the same task as us, went about it different ways, some using 3rd party ETL tools etc. but they've had a lot harder job getting it right and a lot of manual work to do; and some of the other suppliers were given our solution as their starting point for their own source data, and have had good success with it.

    Your task sounds like you want to do something similar, parsing the XML document to generate the paths and code around it.

  • Hi,

    The UI provides the option to capture the Column names manually, and if there are repeating values like Address, the UI would value the Indexing option like

    Assume the following data has been repeated twice

    /person/homecity/address

    /person/homecity/address


    which are basically address1 and address2 columns in our tables,

    we would internally index the data as shown below


    /person/homecity/address[1]
    /person/homecity/address[2]


    So with this changes, our dynamic SQL build would be able to extract the data from the select statement.


    This mapping of the column names to the XPath will be a one time activity done by us only, and we will be knowing what columns should be mapped to what XPath and finally save the records into the data base.

    Not all users will be having the access to map the columns to the Xpath. Usually it will be the developers work.


    Thanks.

  • odie_63
    odie_63 Member Posts: 8,491 Silver Trophy
    edited Nov 7, 2013 6:04AM
    Assume the following data has been repeated twice
    
    /person/homecity/address
    /person/homecity/address
    
    which are basically address1 and address2 columns in our tables,

    That's not the situation I had in mind.

    I'll rephrase the question : what about unbounded sequence of elements, mapped to rows in your target table ?


This discussion has been closed.