-
15. Re: ORA-31011: XML parsing failed. ORA-19213: error occurred in XML processing at lines xxx
chris227 Nov 17, 2016 3:24 PM (in response to user6029069)I guess that your are one the wrong way.
Think about it:
On one side you have this mystical hyphen you dont know from which characterset it comes from.
On the other side you have some output: –
What you see is that there are three characters. From that we can conclude that the original character took three bytes and from this we know that this coudnt be a 2-byte charactersset like WINDOWS-1252.
Let's verfify it:
select convert('–', 'WE8MSWIN1252', 'UTF8') c from dual;
C
"–"
So the xml encoding and the data contained do not match.
Therefore replacing dedicated characters might not be a robust solution.
BTW: what happens if you try:
XMLType(rpt, nls_charset_id('AL32UTF8'))
-
16. Re: ORA-31011: XML parsing failed. ORA-19213: error occurred in XML processing at lines xxx
odie_63 Nov 17, 2016 3:58 PM (in response to chris227)chris227 wrote:
What you see is that there are three characters. From that we can conclude that the original character took three bytes and from this we know that this coudnt be a 2-byte charactersset like WINDOWS-1252.
Exactly what I thought when I saw latest OP's "issue".
(except that WINDOWS-1252 is a single-byte charset... )
BTW: what happens if you try:
XMLType(rpt, nls_charset_id('AL32UTF8'))
Suggested that earlier too but no feedback at that time.
-
17. Re: ORA-31011: XML parsing failed. ORA-19213: error occurred in XML processing at lines xxx
user6029069 Nov 18, 2016 2:47 PM (in response to chris227)Thanks Chris, using nls_charset_id('AL32UTF8') failed too. What I did was to you my function that parse_text to replace characters that are not between chr(32) and chr(126) with a space and this seems to work. I am not sure it is the best solution but it worked.
-
18. Re: ORA-31011: XML parsing failed. ORA-19213: error occurred in XML processing at lines xxx
odie_63 Nov 18, 2016 3:01 PM (in response to user6029069)1 person found this helpfuluser6029069 wrote:
What I did was to you my function that parse_text to replace characters that are not between chr(32) and chr(126) with a space and this seems to work. I am not sure it is the best solution but it worked.
If you're happy with your solution, so be it
If you want to dig further and get to the bottom of this, we'll need more input from you :
For starters, let's make sure what that BLOB column really contain, in what encoding.
Could you post the output of this for your file :
select rawtohex(dbms_lob.substr(rpt, 2000))
from some_table1
where xml_file_id = ? ;
(It should be a long string of hex digits)
-
19. Re: ORA-31011: XML parsing failed. ORA-19213: error occurred in XML processing at lines xxx
user6029069 Nov 18, 2016 3:15 PM (in response to user6029069)This is the real data
3C3F786D6C2076657273696F6E3D22312E302220656E636F64696E673D2257494E444F57532D31323532223F3E0D0A3C212D2D2047656E657261746564206279204F7261636C65205265706F7274732076657273696F6E2031312E312E322E322E30202D2D3E0D0A3C494E564330353E0D0A20203C4C4953545F475F494E56433E0D0A202020203C475F494E56433E0D0A2020202020203C494E56435F4E4F3E31362D3230353C2F494E56435F4E4F3E0D0A2020202020203C494E56435F49443E343437303637363C2F494E56435F49443E0D0A2020202020203C5354415455533E3C2F5354415455533E0D0A2020202020203C42494C4C5F544F5F42495A454E545F4E414D453E3137373620454E45524759204F50455241544F5253204C4C433C2F42494C4C5F544F5F42495A454E545F4E414D453E0D0A2020202020203C42494C4C544F5F414444524553533E32303434352053544154452048494748574159203234390D0A53554954452034350D0A484F5553544F4E2C2054455841532037373037300D0A554E4954454420535441544553204F4620414D45524943413C2F42494C4C544F5F414444524553533E0D0A2020202020203C4C41594F55545F494E56435F4E4F3E31362D3230353C2F4C41594F55545F494E56435F4E4F3E0D0A2020202020203C42494C4C544F5F434F4E544143543E3C2F42494C4C544F5F434F4E544143543E0D0A2020202020203C494E56435F444F435F444154453E31352D4E4F562D31363C2F494E56435F444F435F444154453E0D0A2020202020203C414343545F5445524D5F444553433E436173683C2F414343545F5445524D5F444553433E0D0A2020202020203C454E445F555345525F4453545F434E5452595F43443E3C2F454E445F555345525F4453545F434E5452595F43443E0D0A2020202020203C444953435F444154453E31352D4E4F562D31363C2F444953435F444154453E0D0A2020202020203C444953435F50434E54313E303C2F444953435F50434E54313E0D0A2020202020203C5052455F5041595F494E56435F49443E3C2F5052455F5041595F494E56435F49443E0D0A2020202020203C5052455F5041595F494E56435F4E4F3E3C2F5052455F5041595F494E56435F4E4F3E0D0A2020202020203C5052455F5041595F494E56435F414D543E303C2F5052455F5041595F494E56435F414D543E0D0A2020202020203C414343545F504552494F443E30312D4645422D31343C2F414343545F504552494F443E0D0A2020202020203C434954593E3C2F434954593E0D0A2020202020203C4455455F444154453E31352D4E4F562D31363C2F4455455F444154453E0D0A2020202020203C434954595F5441585F50434E543E303C2F434954595F5441585F50434E543E0D0A2020202020203C434954595F5441585F524154453E303C2F434954595F5441585F524154453E0D0A2020202020203C434F554E54593E484F5553544F4E3C2F434F554E54593E0D0A2020202020203C434F554E54595F5441585F524154453E2E30313C2F434F554E54595F5441585F524154453E0D0A2020202020203C434F554E54595F5441585F50434E543E313C2F434F554E54595F5441585F50434E543E0D0A2020202020203C53544154455F50524F563E54455841533C2F53544154455F50524F563E0D0A2020202020203C53544154455F50524F565F5441585F524154453E2E303632353C2F53544154455F50524F565F5441585F524154453E0D0A2020202020203C53544154455F50524F565F5441585F50434E543E362E32353C2F53544154455F50524F565F5441585F50434E543E0D0A2020202020203C434F554E5452593E554E4954454420535441544553204F4620414D45524943413C2F434F554E5452593E0D0A2020202020203C434F554E5452595F5441585F524154453E2E30383C2F434F554E5452595F5441585F524154453E0D0A2020202020203C54584E5F4355524E43595F43443E5553443C2F54584E5F4355524E43595F43443E0D0A2020202020203C434F554E5452595F5441585F50434E543E383C2F434F554E5452595F5441585F50434E543E0D0A2020202020203C544F54414C5F5441585F50434E543E31352E32353C2F544F54414C5F5441585F50434E543E0D0A2020202020203C5441585F444953435F464C41473E4E3C2F5441585F444953435F464C41473E0D0A2020202020203C5441585F434F4D4D454E54313E3C2F5441585F434F4D4D454E54313E0D0A2020202020203C494E545F4F524445525F5245463E3C2F494E545F4F524445525F5245463E0D0A2020202020203C4558545F434F4D4D454E543E4F43544732303136204252454E444120444F524355532028534129204120333632304D53202D20534F31362D313238390D0A434F53542043454E544552209620333935343030350D0A0D0A544F2043524544495420594F5552204143434F554E5420464F522050524943452041444A5553544D454E54204F4E20494E564F4943452331362D3230353C2F4558545F434F4D4D454E543E0D0A2020202020203C494E56435F54595045313E523C2F494E56435F54595045313E0D0A2020202020203C52454D49545F414444523E507972616D696420547562756C61722050726F6475637473204C2E502E2C2032204E6F727468706F696E74204472697665205375697465203631302C20486F7573746F6E2C2054582037373036303C2F52454D49545F414444523E0D0A20202020
-
20. Re: ORA-31011: XML parsing failed. ORA-19213: error occurred in XML processing at lines xxx
odie_63 Nov 18, 2016 8:40 PM (in response to user6029069)Thanks.
With the data you've provided, I can confirm the XML is indeed encoded in WINDOWS-1252.
Here's the test case on a US7ASCII database -
I've extracted the relevant part of the binary data to reproduce the conversion to XMLType :
create table my_table (id integer, rpt blob);
insert into my_table
values (1, hextoraw(
'3C4558545F434F4D4D454E543E4F43544732303136204252454E' ||
'444120444F524355532028534129204120333632304D53202D20' ||
'534F31362D313238390D0A434F53542043454E54455220962033' ||
'3935343030350D0A0D0A544F2043524544495420594F55522041' ||
'43434F554E5420464F522050524943452041444A5553544D454E' ||
'54204F4E20494E564F4943452331362D3230353C2F4558545F43' ||
'4F4D4D454E543E'));
The highlighted byte in red represents the EN DASH character encoded in WINDOWS-1252.
Conversion to XMLType :
SQL> select xmlserialize(content
2 xmltype(rpt, nls_charset_id('WE8MSWIN1252'))
3 )
4 from my_table
5 where id = 1;
XMLSERIALIZE(CONTENTXMLTYPE(RP
--------------------------------------------------------------------------------
<EXT_COMMENT>OCTG2016 BRENDA DORCUS (SA) A 3620MS - SO16-1289
COST CENTER ? 3954005
TO CREDIT YOUR ACCOUNT FOR PRICE ADJUSTMENT ON INVOICE#16-205</EXT_COMMENT>
As expected, the dash was translated to a replacement character '?' because US-ASCII doesn't support it.
Now the $100 question :
How do you go from a replacement character back to this : – which happens to be exactly the character representation of E2 80 93, the UTF-8 byte sequence of the EN DASH character?
We're obviously missing some pieces of the puzzle here.
What process is there between the XMLType conversion and that CSV generation you're talking about?
-
21. Re: ORA-31011: XML parsing failed. ORA-19213: error occurred in XML processing at lines xxx
chris227 Nov 21, 2016 12:57 PM (in response to user6029069)According odie's question from above:
"Now the $100 question :
How do you go from a replacement character back to this : – which happens to be exactly the character representation of E2 80 93, the UTF-8 byte sequence of the EN DASH character?
We're obviously missing some pieces of the puzzle here."
i assume that the xml is generated from a WINDOWS-1252 System where UTF-8 data was stored without approbiate conversion.
In a oracle-DB varchar2 column for example a 3-byte UTF-8 characters may be stored as it is and is not cut off or something else.
Furhermore a client might display this faulty data correctly so the issue might occurs not before some more strict tools (regarding characters) come in play like xml-parser in this case.
-
22. Re: ORA-31011: XML parsing failed. ORA-19213: error occurred in XML processing at lines xxx
user6029069 Dec 1, 2016 5:14 AM (in response to chris227)My solution did not fly again with the management. I need your help on how to convert – chr(151) to chr(45).
I just need a simple replace function like
select Replace(some_string, Chr(151). chr(45)) from dual. I am getting a muitibyte error. I really need your help. Thanks
-
23. Re: ORA-31011: XML parsing failed. ORA-19213: error occurred in XML processing at lines xxx
odie_63 Dec 1, 2016 8:02 AM (in response to user6029069)user6029069 wrote:
My solution did not fly again with the management. I need your help on how to convert – chr(151) to chr(45).
I just need a simple replace function like
select Replace(some_string, Chr(151). chr(45)) from dual. I am getting a muitibyte error.
A simple replace is not the solution.
Could you please answer the two questions I asked earlier?
How do you go from a replacement character back to this : – which happens to be exactly the character representation of E2 80 93, the UTF-8 byte sequence of the EN DASH character?
We're obviously missing some pieces of the puzzle here.
What process is there between the XMLType conversion and that CSV generation you're talking about?
My second concern is your database character set.
You said earlier that it was US7ASCII and I've based my reasoning on that, but how could that be true if you have strings like '–' stored in your db.
Or maybe it's a NVARCHAR2 column?