What about using DBMS_XMLDOM.GETXMLTYPE (http://docs.oracle.com/cd/E11882_01/appdev.112/e25788/d_xmldom.htm#i1124847) to convert l_doc into an XMLType variable, so you can insert the data into a table in the DB? Then you can use the methods that odie_63 (Marc) is showing you to accomplish the task in what should be a reasonable time.
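A minimal sketch of that idea, assuming p_data is the source CLOB and the tmp_xml temporary table from odie_63's approach exists (both are assumptions carried over from elsewhere in this thread):

```sql
declare
  l_parser  dbms_xmlparser.parser;
  l_doc     dbms_xmldom.domdocument;
  l_xml     xmltype;
begin
  -- parse the CLOB once with the DOM parser (reported as fast above)
  l_parser := dbms_xmlparser.newParser;
  dbms_xmlparser.parseClob(l_parser, p_data);
  l_doc := dbms_xmlparser.getDocument(l_parser);

  -- convert the DOMDocument into an XMLType without re-parsing the CLOB
  l_xml := dbms_xmldom.getXmlType(l_doc);

  -- stage it in the DB so it can be queried relationally
  insert into tmp_xml values (l_xml);

  dbms_xmldom.freeDocument(l_doc);
  dbms_xmlparser.freeParser(l_parser);
end;
/
```

The point of the detour through DBMS_XMLDOM is to reuse the parse that dbms_xmlparser already did, rather than paying for the XMLType(CLOB) constructor a second time.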
As I didn't see it specified: are you saying the "takes hours to complete" applies to the entire procedure when using XMLType, or simply to the line
l_xmltype := XMLType(p_data);
If only for that line, I find it interesting that dbms_xmlparser.parseClob is able to speed through the conversion from CLOB to DOMDocument, but XMLType(p_data) is not able to do the same from CLOB to XMLType. It seems something is going on inside XMLType that should not be.
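One way to confirm that comparison is to time the two routes side by side. A sketch, assuming p_data is the same CLOB used elsewhere in this thread:

```sql
declare
  l_parser  dbms_xmlparser.parser;
  l_dom     dbms_xmldom.domdocument;
  l_xml     xmltype;
  l_t0      timestamp;
begin
  -- Route 1: DOM parser (reported fast in this thread)
  l_t0 := systimestamp;
  l_parser := dbms_xmlparser.newParser;
  dbms_xmlparser.parseClob(l_parser, p_data);
  l_dom := dbms_xmlparser.getDocument(l_parser);
  dbms_output.put_line('parseClob : ' || (systimestamp - l_t0));
  dbms_xmldom.freeDocument(l_dom);
  dbms_xmlparser.freeParser(l_parser);

  -- Route 2: XMLType constructor (the reported slow step)
  l_t0 := systimestamp;
  l_xml := xmltype(p_data);
  dbms_output.put_line('XMLType() : ' || (systimestamp - l_t0));
end;
/
```

If the second timing dwarfs the first on the same document, that isolates the constructor itself as the bottleneck rather than the surrounding procedure.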
If I missed something in my reading, let me know Marc or Marco.
Wasn't sure - now I know :-)
l_xmltype := XMLType(p_data);
takes long with my huge xml
This conversion problem is very strange.
Have you tried investigating the possible causes: where the time is being spent, memory consumption, etc.?
Here's the other approach you could try :
In my opinion, the requirement clearly dictates to treat the document as relational data, so that it could be joined to the base table data.
A MERGE statement seems appropriate in this case, with a temporary table to hold the parsed data.
The test document contains 50,000 <Message> elements, and the CUSTOMERDATA table 25,000 rows that match half of the messages from the XML.
create global temporary table tmp_xml of xmltype;

create global temporary table tmp_customerdata (
  col01 varchar2(300)
, col02 varchar2(300)
, col03 varchar2(300)
);
The update document is regenerated from scratch by joining the temp table and the updated base table.
The block runs in about 26s (could be further optimized with indexes).
DECLARE
  v_doc      clob;
  v_new_doc  clob;
  ts_start   timestamp := systimestamp;

  -- a little trace proc :)
  procedure trace_time (mess in varchar2) is
    tmp timestamp := systimestamp;
  begin
    dbms_output.put_line(rpad(mess,30) ||' : '|| (tmp - ts_start));
    ts_start := tmp;
  end;

BEGIN
  select doc into v_doc from testclob;
  trace_time('CLOB fetch');

  insert into tmp_xml values( xmlparse(document v_doc) );
  trace_time('XMLType insert');

  insert into tmp_customerdata (col01, col02, col03)
  select x.col01, x.col02, x.col03
  from tmp_xml t
     , xmltable('/Root/Message' passing t.object_value
         columns col01 varchar2(300) path 'Customer'
               , col02 varchar2(300) path 'MessageType'
               , col03 varchar2(300) path 'Key'
       ) x ;
  trace_time('XML data parsing');

  /* Merge XML data into CUSTOMERDATA table */
  merge into customerdata t
  using (
    select col01, col02, col03
    from tmp_customerdata
  ) v
  on (    v.col01 = t.col01
      and v.col02 = t.col02
      and v.col03 = t.col03
      and t.groupid = '10' )
  when matched then update
    set t.col05 = t.col05 + 1
      , t.col06 = to_char(sysdate, 'YYYY-MM-DD HH24:MI:SS')
  when not matched then insert
    (groupid, col01, col02, col03, col04, col05, col06)
    values ( '10'
           , v.col01
           , v.col02
           , v.col03
           , sq_value.nextval
           , '1'
           , to_char(sysdate, 'YYYY-MM-DD HH24:MI:SS') ) ;
  trace_time('Data merging');

  /* Generate the updated XML document */
  -- col04 and col05 live in CUSTOMERDATA, not the temp table,
  -- so they are taken from alias c
  select xmlserialize(document
           xmlelement("Root",
             xmlagg(
               xmlelement("Message",
                 xmlforest( t.col01 as "Customer"
                          , t.col02 as "MessageType"
                          , t.col03 as "Key"
                          , c.col04 as "Value"
                          , c.col05 as "Hits" )
               )
             )
           )
         )
  into v_new_doc
  from tmp_customerdata t
  join customerdata c on  c.col01 = t.col01
                      and c.col02 = t.col02
                      and c.col03 = t.col03
                      and c.groupid = '10' ;
  trace_time('Generating new document');
END;
/
CLOB fetch                     : +000000000 00:00:00.000000000
XMLType insert                 : +000000000 00:00:11.310000000
XML data parsing               : +000000000 00:00:03.073000000
Data merging                   : +000000000 00:00:07.020000000
Generating new document        : +000000000 00:00:04.368000000
Thanks odie_63 for suggesting this. It is the quickest of the options.
Have you tried investigating on the possible causes? Where the time is being spent, memory consumption etc.
My test had just one line where I do a XMLType(CLOB), and that alone took hours to complete.
The solution you have given in the last post is the quickest. I'm now armed with more knowledge about XML parsing in Oracle. Many thanks.
After about 10 minutes into execution, I get:
ERROR at line 1:
ORA-01652: unable to extend temp segment by 128 in tablespace TEMP
ORA-06512: at line 71
The TEMP tablespace is 2 GB.
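ORA-01652 means the sort/temp segments needed by the parsing and aggregation steps outgrew the 2 GB TEMP tablespace. One sketch of a fix, assuming DBA access; the file paths below are hypothetical and must be replaced with your own:

```sql
-- Inspect current TEMP sizing and whether files can autoextend
select tablespace_name, file_name,
       round(bytes/1024/1024/1024, 2) as gb,
       autoextensible
from   dba_temp_files;

-- Option 1: grow the existing tempfile (hypothetical path)
alter database tempfile '/u01/oradata/ORCL/temp01.dbf' resize 8G;

-- Option 2: add a second tempfile instead (hypothetical path)
alter tablespace temp
  add tempfile '/u01/oradata/ORCL/temp02.dbf' size 4G
  autoextend on next 512M maxsize 8G;
```

With a 50,000-element document being parsed, merged, and re-serialized, TEMP demand can easily exceed the CLOB's own size, so sizing it with headroom is the more robust option.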