10 Replies Latest reply: Feb 8, 2013 6:55 AM by user13117585 RSS

    String Aggregation Using XML

    user13117585
      Hello guys,

      I have a weird problem. Imagine the following scenario...
      CREATE TABLE my_body_test
      (
        id number(10), 
        rnk NUMBER(10),
        body VARCHAR2(1000)
      );
      
      INSERT INTO my_body_test VALUES(1, 1, 'Hello ');
      INSERT INTO my_body_test VALUES(1, 2, 'World [');
      INSERT INTO my_body_test VALUES(1, 2, CHR(12));
      INSERT INTO my_body_test VALUES(1, 2, ']!!');
      
      DROP TABLE my_body_test;
      I would like to aggregate the content of the column body. I can't une SYS_CONNECT_BY_PATH because the body contains a lot of data and can be more than 4000 chars. So, I do it this way:
      SELECT id, 
             dbms_xmlgen.convert(xmlagg(xmlelement(e, body) ORDER BY rnk ).extract('//text()').getClobVal(), 1)
        FROM my_body_test
       GROUP BY id;
      But I have the following error.
      ORA-31011: XML parsing failed
      31011. 00000 -  "XML parsing failed"
      *Cause:    XML parser returned an error while trying to parse the document.
      *Action:   Check if the document to be parsed is valid.
      I know it's because of the third row inserted [CHR(12)]. But is there any way to fix this?? I'm using 10g (LISTAGG is not an option).

      Thanks for your advice,
        • 1. Re: String Aggregation Using XML
          IvanBlanarik
          Hi,
          if I were you I would use customized STRAGG agregation function written by Tom Kyte (just change datatype from varchar2 to clob).
          It's quite easy solution:
          http://www.sqlsnippets.com/en/topic-11591.html
          • 2. Re: String Aggregation Using XML
            Solomon Yakobson
            CHR(12) is invalid caharacter for xml:
            SQL> SELECT id, 
              2         dbms_xmlgen.convert(xmlagg(xmlelement(e, body) ORDER BY rnk ).extract('//text()').getClobVal(), 1)
              3    FROM my_body_test
              4   GROUP BY id;
            ERROR:
            ORA-31061: XDB error: special char to escaped char conversion failed.
            
            
            
            no rows selected
            
            SQL> SELECT id, 
              2         dbms_xmlgen.convert(xmlagg(xmlelement(e, replace(body,chr(12),'*')) ORDER BY rnk ).extract('//text()').getClobVal(), 1)
              3    FROM my_body_test
              4   GROUP BY id;
            
                    ID DBMS_XMLGEN.CONVERT(XMLAGG(XMLELEMENT(E,REPLACE(BODY,CHR(12),'*'))ORDERBYRNK).EX
            ---------- --------------------------------------------------------------------------------
                     1 Hello World []!!*
            
            SQL> 
            SY.
            • 4. Re: String Aggregation Using XML
              odie_63
              Arild wrote:
              Perhaps this is a relevant thread: Getting "ORA-31011: XML parsing failed" using xmlAGG as rownum exceeds 1000
              Correct.

              Now it depends on the exact db version and patch level.

              Edited by: odie_63 on 7 févr. 2013 16:42
              • 5. Re: String Aggregation Using XML
                user13117585
                Thank you for your answers.
                But since I can not control the content of the body string (could even anything) how can I remove all invalid xml chars? My db version is
                BANNER                                                         
                ----------------------------------------------------------------
                Oracle Database 10g Enterprise Edition Release 10.2.0.5.0 - 64bi 
                PL/SQL Release 10.2.0.5.0 - Production                           
                CORE     10.2.0.5.0     Production                                         
                TNS for Solaris: Version 10.2.0.5.0 - Production                 
                NLSRTL Version 10.2.0.5.0 - Production                           
                • 6. Re: String Aggregation Using XML
                  odie_63
                  Did you click on the link given by Arild?

                  Did you try the different solutions provided?

                  e.g. to replace invalid chars with '?'
                  alter session set events = '19119 trace name context forever, level 0x100000' ;
                  Connected to:
                  Oracle Database 10g Enterprise Edition Release 10.2.0.5.0 - 64bit Production
                  With the Partitioning, OLAP, Data Mining and Real Application Testing options
                  
                  SQL> select xmlelement(e, chr(4)) from dual;
                  ERROR:
                  ORA-31011: Echec d'analyse XML
                  
                  
                  
                  no rows selected
                  
                  SQL> alter session set events = '19119 trace name context forever, level 0x100000' ;
                  
                  Session altered.
                  
                  SQL> select xmlelement(e, chr(4)) from dual;
                  
                  XMLELEMENT(E,CHR(4))
                  --------------------------------------------------------------------------------
                  <E>?</E>
                  And maybe a more sustainable approach would be to investigate on why you have such characters in the first place.
                  • 7. Re: String Aggregation Using XML
                    user13117585
                    Unfortunately, I can not ALTER SESSION (SQL Error: ORA-01031: insufficient privileges).

                    Why I have these characters? Well, it's easy. We store some emails and their attachments in this table not as clob but as varchar splitted every 1000 chars. And in the attachments, I can have word documents, pdf... anything actually. even pictures. So it's normal to have that in their. I use XML just to rebuild the original email...


                    Regards,
                    • 8. Re: String Aggregation Using XML
                      odie_63
                      user13117585 wrote:
                      Unfortunately, I can not ALTER SESSION (SQL Error: ORA-01031: insufficient privileges).
                      Ask your DBA to grant you the privilege :)
                      And in the attachments, I can have word documents, pdf... anything actually. even pictures. So it's normal to have that in their.
                      Indeed.
                      What is not normal is to store binary attachments into VARCHAR2 instead of BLOB.
                      I use XML just to rebuild the original email...
                      Don't use XML then. Those events will remove or alter the initial character occurences, so even though you manage to rebuild the original email, its content will be corrupted.

                      I'd look towards another CLOB aggregation technique, such as CLOBAGG function. Search for it in the forum, you'll find multiple posts giving its implementation.
                      • 9. Re: String Aggregation Using XML
                        IvanBlanarik
                        You could replace special characters with spaces:
                        SELECT id, 
                             dbms_xmlgen.convert(xmlagg(xmlelement(e, regexp_replace(body,'[[:cntrl:]]', ' ')) ORDER BY rnk ).extract('//text()').getClobVal(), 1)
                        FROM my_body_test
                        GROUP BY id;
                         
                        • 10. Re: String Aggregation Using XML
                          user13117585
                          odie_63 wrote:
                          user13117585 wrote:
                          Unfortunately, I can not ALTER SESSION (SQL Error: ORA-01031: insufficient privileges).
                          Ask your DBA to grant you the privilege :)
                          And in the attachments, I can have word documents, pdf... anything actually. even pictures. So it's normal to have that in their.
                          Indeed.
                          What is not normal is to store binary attachments into VARCHAR2 instead of BLOB.
                          I use XML just to rebuild the original email...
                          Don't use XML then. Those events will remove or alter the initial character occurences, so even though you manage to rebuild the original email, its content will be corrupted.

                          I'd look towards another CLOB aggregation technique, such as CLOBAGG function. Search for it in the forum, you'll find multiple posts giving its implementation.
                          I'm not the one who designed the system. I would have used another solution. For sure. CLOB or even storing the thing on a file server would have been easier. But since the solution has been designed 6 years ago... I can't change it today so easilly...


                          Thank you guys. I will try the solution Tom Kytes provides (CLOBAGG)