My environment is Oracle DB 18.104.22.168.
I have some XML documents in a CLOB that have a three level hierarchie.
Cardinality for Level1:Level2 is 1:n and for Level2:Level3 also 1:n
The XML looks something like this:
I would like to convert this into a normalized relational table structure
This is in an incremental load DWH type of scenario, so I will have to do this initially for all previously existing records (approx. 200 Mio) and then daily for the additional records (approx 1Mio per day). I am using MERGE statements for the incremental loads.
Some of the documents may have up to 100 or more elements on level 2 with about 1-5 level 3 elements each.
What is the most effective way of doing this?
My first idea was to convert the CLOB into an XMLTYPE and then use 3 nested XMLTABLEs for parsing out the XPATH expressions. However, it feels like doing this, I am denormalizing everything (flattening it) and then regrouping into the 3 normalized levels...
Does it make sense to apply an XmlIndex to the XMLTYPE data? I am not sure whether the overhead for creating the index and adding the new records into it is really killing the gain from it on the parsing in the next step. Since this is a DWH incremental load situation, each record is only processed once, so I am thinking maybe not creating an XmlIndex might be faster.
I can see how an XmlIndex would be helpful for extracting very small fragments out of relatively large documents. But in this case, pretty much the whole document is relevant, which adds to my doubts about an index making sense on this one...
Thank you in advance...