10 Replies Latest reply: Nov 23, 2012 7:21 AM by odie_63 RSS

    "Edge table"

    jhoward
      Is there an easy to implement way to get a schema-independent representation of the content stored in an XMLTYPE column? Rather than building XMLTABLEs & manually specifying PATH a thousand times, I'd like an easy way to free the content from the hierarchy of xml and get it into a key,value-ish representation as quickly and painlessly as possible.

      Kinda like this?

      http://www.informit.com/articles/article.aspx?p=169590&seqNum=5
      Schema-Independent Representation
      
      One possibility for handling the vagaries of full XML is to use a relational representation 
      that is totally independent of the schema of the XML data. An example of such a representation 
      is an edge table, in which a single table contains information about the relationship between
       parent and child elements (using node identifiers), and another column contains simple
       element content. Listing 6.14 provides an example of an XML fragment and an edge-table
       representation of it.
      Listing 6.14 An Edge Table for an XML Fragment
      
      <Department id="A12x" name="Physics">
        <Employee id="555-23-4567">
          <name>Mary Phillips</name>
          <office>21</office>
        </Employee>
        <Employee id="544-12-3456">
          ...
      
      edge table
      parentid  childid  name         value
      _____________________________________________________________________
      null      1        Department   null
      1         2        @id          A12x
      1         3        @name        Physics
      1         4        Employee     null
      4         5        @id          555-23-4567
      4         6        name         Mary Phillips
      4         7        office       21
      1         8        Employee     null
      8         9        @id          544-12-3456
        • 1. Re: "Edge table"
          odie_63
          Some examples here :
          http://odieweblog.wordpress.com/2011/12/13/how-to-flatten-out-an-xml-hierarchical-structure/
          • 2. Re: "Edge table"
            Marco Gralike
            Thanks for the link by the way...
            I'd like an easy way to free the content from the hierarchy of xml and get it into a key,value-ish representation as quickly and painlessly as possible.
            I guess a lot of people would like the answer to that one to be able to write efficient XML parsers. What is your goal in the end?

            For more background info also see this ppt http://www.dbis.ethz.ch/research/publications/50.ppt
            • 3. Re: "Edge table"
              jhoward
              Thanks for the link Odie! Very nice! On "3. The XSLT way… with a bit of XQuery", would it be hard to add the full XPATH as an additional column?
              • 4. Re: "Edge table"
                jhoward
                Actually, I think I can use CONNECT BY to build that list once the XML's dumped into a table to build those xpaths. Thanks again
                • 5. Re: "Edge table"
                  odie_63
                  Actually, I think I can use CONNECT BY to build that list once the XML's dumped into a table to build those xpaths.
                  That's one way, indeed.
                  If you want the positional predicates too, you'll need an additional step, for example :
                  SQL> SELECT node_id
                    2       , parent_node_id
                    3       , node_name
                    4       , node_value
                    5       , sys_connect_by_path(xpath_step, '/') as xpath
                    6  FROM (
                    7    SELECT x.node_id
                    8         , x.parent_node_id
                    9         , x.node_name
                   10         , x.node_value
                   11         , case x.node_kind
                   12             when 'attribute' then '@' || x.node_name
                   13             when 'element' then x.node_name || '[' ||
                   14                                 to_char( row_number() over( partition by x.parent_node_id
                   15                                                                        , x.node_name
                   16                                                             order by x.node_position ) ) || ']'
                   17           end as xpath_step
                   18         , x.node_position
                   19    FROM xml_test t
                   20       , XMLTable('/ROWSET/ROW'
                   21          passing xmltransform(t.doc, t.xsl)
                   22          columns node_id         varchar2(100)   path '@id'
                   23                , node_name       varchar2(30)    path '@name'
                   24                , node_value      varchar2(2000)  path 'text()'
                   25                , parent_node_id  varchar2(100)   path '@pid'
                   26                , node_kind       varchar2(30)    path '@kind'
                   27                , node_position   for ordinality
                   28         ) x
                   29  )
                   30  CONNECT BY PRIOR node_id = parent_node_id
                   31  START WITH parent_node_id IS NULL
                   32  ORDER SIBLINGS BY node_position
                   33  ;
                   
                  NODE_ID    PARENT_NOD NODE_NAME                      NODE_VALUE                               XPATH
                  ---------- ---------- ------------------------------ ---------------------------------------- --------------------------------------------------------------------------------
                  K1                    Product                                                                 /Product[1]
                  K2         K1         Type                           Laptop                                   /Product[1]/@Type
                  K3         K1         Notebook                                                                /Product[1]/Notebook[1]
                  K4         K3         Brand                          HP                                       /Product[1]/Notebook[1]/@Brand
                  K5         K3         Model                          Pavilion dv6-3132TX Notebook             /Product[1]/Notebook[1]/@Model
                  K6         K3         Harddisk                       640 GB                                   /Product[1]/Notebook[1]/Harddisk[1]
                  K8         K3         Processor                      Intel Core i7                            /Product[1]/Notebook[1]/Processor[1]
                  K10        K3         RAM                            4 GB                                     /Product[1]/Notebook[1]/RAM[1]
                  K12        K1         Notebook                                                                /Product[1]/Notebook[2]
                  K13        K12        Brand                          HP                                       /Product[1]/Notebook[2]/@Brand
                  K14        K12        Model                          HP Pavilion dv6-3032TX Notebook          /Product[1]/Notebook[2]/@Model
                  K15        K12        Harddisk                       640 GB                                   /Product[1]/Notebook[2]/Harddisk[1]
                  K17        K12        Processor                      Intel Core i7                            /Product[1]/Notebook[2]/Processor[1]
                  K19        K12        RAM                            6 GB                                     /Product[1]/Notebook[2]/RAM[1]
                  K21        K1         Notebook                                                                /Product[1]/Notebook[3]
                  K22        K21        Brand                          Toshiba                                  /Product[1]/Notebook[3]/@Brand
                  K23        K21        Model                          Satellite A660/07R 3D Notebook           /Product[1]/Notebook[3]/@Model
                  K24        K21        Harddisk                       640 GB                                   /Product[1]/Notebook[3]/Harddisk[1]
                  K26        K21        Processor                      Intel Core i7                            /Product[1]/Notebook[3]/Processor[1]
                  K28        K21        RAM                            4 GB                                     /Product[1]/Notebook[3]/RAM[1]
                  K30        K1         Notebook                                                                /Product[1]/Notebook[4]
                  K31        K30        Brand                          Toshiba                                  /Product[1]/Notebook[4]/@Brand
                  K32        K30        Model                          Satellite A660/15J Notebook              /Product[1]/Notebook[4]/@Model
                  K33        K30        Harddisk                       640 GB                                   /Product[1]/Notebook[4]/Harddisk[1]
                  K35        K30        Processor                      Intel Core i5                            /Product[1]/Notebook[4]/Processor[1]
                  K37        K30        RAM                            6 GB                                     /Product[1]/Notebook[4]/RAM[1]
                   
                  26 rows selected
                   
                  • 6. Re: "Edge table"
                    odie_63
                    It can all be done in XSLT too :
                    <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
                      <xsl:output method="xml"/>
                      <xsl:template match="/">
                        <ROWSET>
                          <xsl:apply-templates select="*"/>
                        </ROWSET>
                      </xsl:template>
                      <xsl:template match="*">
                        <xsl:param name="pid"/>
                        <xsl:param name="pxp"/>
                        <xsl:variable name="id" select="generate-id(.)"/>
                        <xsl:variable name="name" select="name(.)"/>
                        <xsl:variable name="xp" select="concat($pxp,'/',$name,'[',count(preceding-sibling::*[name()=$name])+1,']')"/>
                        <ROW id="{$id}" pid="{$pid}" name="{$name}" kind="element" xpath="{$xp}">
                          <xsl:value-of select="text()"/>
                        </ROW>
                        <xsl:apply-templates select="*|@*">
                          <xsl:with-param name="pid">
                            <xsl:value-of select="$id"/>
                          </xsl:with-param>
                          <xsl:with-param name="pxp">
                            <xsl:value-of select="$xp"/>
                          </xsl:with-param>
                        </xsl:apply-templates>
                      </xsl:template>
                      <xsl:template match="@*">
                        <xsl:param name="pid"/>
                        <xsl:param name="pxp"/>
                        <ROW id="{generate-id(.)}" pid="{$pid}" name="{name(.)}" kind="attribute" xpath="{concat($pxp,'/@',name(.))}">
                          <xsl:value-of select="."/>
                        </ROW>
                      </xsl:template>
                    </xsl:stylesheet>
                    • 7. Re: "Edge table"
                      jhoward
                      Great stuff! Thanks so much
                      • 8. Re: "Edge table"
                        jhoward
                        .
                        • 9. Re: "Edge table"
                          jhoward
                          .
                          • 10. Re: "Edge table"
                            odie_63
                            odie_63 wrote:
                            Some examples here :
                            http://odieweblog.wordpress.com/2011/12/13/how-to-flatten-out-an-xml-hierarchical-structure/
                            And a new faster method :
                            http://odieweblog.wordpress.com/2012/11/18/xml-flattening-revisited-java-based-pipelined-function/