9 Replies. Latest reply: Nov 15, 2010 9:59 PM by 812986

XML serialisation problem using Java Document object

812986 Newbie
Hi,

I am facing an issue while serializing a DOM object into a physical XML file.

I have an org.w3c.dom Document in memory which stores special characters such as “&” in the text node using the Unicode value, which is “&#x26;”.
For example, consider this XML file:

<?xml version="1.0"?>
<test>
A &#x26; B
</test>

Now, whenever I convert this Document object into an XML file on disk using the Transformer class, the output is shown as “&amp;#x26;” instead of “&#x26;”.

<?xml version="1.0"?>
<test>
A &amp;#x26; B
</test>

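I am using the Transformer class for the conversion.
Following is a small code snippet:
import java.io.File;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;

TransformerFactory tf = TransformerFactory.newInstance();
Transformer m = tf.newTransformer();
DOMSource source = new DOMSource(document);
StreamResult result = new StreamResult(new File("C:/test.xml"));
m.transform(source, result);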

Please help me resolve this issue.
  • 1. Re: XML serialisation problem using Java Document object
    812986 Newbie
    Hi, an update: I just checked what was uploaded in my previous post, and it is not the actual data. The issue will not be clear from reading it, so I guess I have to write it in a different form.


    I have an org.w3c.dom Document in memory which stores special characters such as “&” in the text node using the Unicode value, which is “& # x 26 ;”.
    For example, consider this XML file:

    <?xml version="1.0"?>
    <test>
    A & # x 26 ; B
    </test>

    Now, whenever I convert this Document object into an XML file on disk using the Transformer class, the output is shown as “& amp ; # x 26 ;” instead of “& # x 26 ;”.

    <?xml version="1.0"?>
    <test>
    A & amp ; # x 26 ; B
    </test>

    The Unicode value is written as “& # x 26 ;”.
    The erroneous output is “& amp ; # x 26 ;”.
    Please note that I have written it with spaces between the characters to show the actual data, as the data shown previously is not correct due to manipulation by this website itself.
  • 2. Re: XML serialisation problem using Java Document object
    jtahlborn Expert
    that is a valid serialization of the xml. if you read that xml data using an xml parser, you should get back the data you wrote. what exactly is the problem?
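A minimal sketch of the round trip described in this reply, assuming the default JAXP parser and Transformer bundled with the JDK. The class and method names (`RoundTripDemo`, `docWithText`, `serialize`, `parseText`) are made up for illustration. It stores the literal characters `&#x26;` in a text node, serializes the document, and parses the result back:

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

class RoundTripDemo {

    // Build <test>text</test>. createTextNode stores the string character by
    // character; it does not interpret "&#x26;" as a character reference.
    static Document docWithText(String text) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
            Element root = doc.createElement("test");
            root.appendChild(doc.createTextNode(text));
            doc.appendChild(root);
            return doc;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Serialize with the identity Transformer, without the XML declaration.
    static String serialize(Document doc) {
        try {
            Transformer t = TransformerFactory.newInstance().newTransformer();
            t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            t.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Parse an XML string and return the text content of the root element.
    static String parseText(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement().getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String xml = serialize(docWithText("A &#x26; B"));
        System.out.println(xml);            // the "&" in the text is escaped as "&amp;"
        System.out.println(parseText(xml)); // the stored characters come back unchanged
    }
}
```

The serializer has to escape the `&` because the text node holds the characters `&`, `#`, `x`, `2`, `6`, `;` rather than a single ampersand; writing them unescaped would change the meaning of the data when it is read back.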
  • 3. Re: XML serialisation problem using Java Document object
    812986 Newbie
    Hi,

    As mentioned earlier, the problem is that when the Unicode value of the ampersand is written as "& # x 26 ;" in the text node, the output after serialization comes out as
    "& amp ; # x 26 ;", which is obviously not correct. The output I was expecting was the same Unicode value, "& # x 26 ;".

    Please note that I have intentionally put spaces between the characters, because the post gets modified after it is uploaded. Read it as if there were no spaces between the characters.

    Thanks.

    Edited by: 809983 on Nov 11, 2010 10:06 PM

    Edited by: 809983 on Nov 11, 2010 10:07 PM
  • 4. Re: XML serialisation problem using Java Document object
    jtahlborn Expert
    JCP wrote:
    As mentioned earlier the problem is when the unicode value of ampersand is mentioned "& # x 26 ;" in the text node , the output after serialization comes as
    "& amp ; # x 26 ;" which is obviously not correct.
    this is your wrong assumption. as i stated, this is correct. please read up on the details of xml serialization. also, like i said above, if you read this data with an xml parser, you will get back exactly the text you wrote.
    The output I was expecting was same unicode value "& # x 26 ;".
    you are expecting wrong.
  • 5. Re: XML serialisation problem using Java Document object
    812986 Newbie
    OK. In that case, what do you suggest I do in order to get what I want? If not the XML parser, what other options should I consider for serializing the Document object into an XML file? My basic requirement is that the Unicode value be the same before and after serialization. That is, I need the value "& # x 26 ;" to be present in the physical XML file that is generated. Please suggest.

    Thanks

    Edited by: JCP on Nov 12, 2010 6:38 AM
  • 6. Re: XML serialisation problem using Java Document object
    jtahlborn Expert
    why do you require the physical xml file to look like what you expect? like i said, any xml parser should handle that data correctly when you try to read it.
  • 7. Re: XML serialisation problem using Java Document object
    812986 Newbie
    Hey, thanks for the clarification, but that does not fulfill my requirement. I will tell you exactly what is needed. I have a physical XML file which has the ampersand in ASCII format (& amp ;). I need to parse this value from the physical XML file and store it in another, new XML file, but in Unicode format, which is (& # x 26 ;). Now, when I parse using the DOM object, the value I get from the first XML file is & amp ;. I then create a text node that replaces this value with the Unicode format (& # x 26 ;) in a new DOM document. But when I serialize this document, the value changes to (& amp ; # x 26 ;), which is not the correct Unicode format of the ampersand: I have a reader which should show &, but it is showing & # x 26 ;.

    Thanks
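One way to read the requirement above is "carry the ampersand character from one file to the other". The following is only a sketch under that assumption, using the JDK's default JAXP implementation; `AmpersandCopyDemo` and its methods are hypothetical names. It shows that `&amp;` and `&#x26;` both parse to the same single character, and that copying the parsed text (rather than the string `&#x26;`) into the new document produces output that an XML reader displays as `&`:

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

class AmpersandCopyDemo {

    // Parse an XML string and return the text content of the root element.
    // The parser resolves both "&amp;" and "&#x26;" to the character '&'.
    static String parseText(String xml) {
        try {
            DocumentBuilder b = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = b.parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement().getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Read the text of the source XML, copy it unchanged into a new document,
    // and serialize that new document to a String.
    static String copyAndSerialize(String sourceXml) {
        try {
            String text = parseText(sourceXml); // already contains a real '&'
            Document target = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
            Element root = target.createElement("test");
            root.appendChild(target.createTextNode(text)); // no manual escaping
            target.appendChild(root);

            Transformer t = TransformerFactory.newInstance().newTransformer();
            t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            t.transform(new DOMSource(target), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Both spellings yield the same parsed text.
        System.out.println(parseText("<test>A &amp; B</test>"));
        System.out.println(parseText("<test>A &#x26; B</test>"));
        // The copy is written with the ampersand escaped once, not twice.
        System.out.println(copyAndSerialize("<test>A &amp; B</test>"));
    }
}
```

The double escaping in the thread comes from putting an already-escaped string into the text node; the DOM expects the plain character and leaves escaping to the serializer.").equals("A & B");
assert AmpersandCopyDemo.parseText("<test>A &#x26; B</test>").equals("A & B");
String out = AmpersandCopyDemo.copyAndSerialize("<test>A &amp; B</test>");
assert out.contains("A &amp; B");
assert AmpersandCopyDemo.parseText(out).equals("A & B");
</test>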
  • 8. Re: XML serialisation problem using Java Document object
    jtahlborn Expert
    i'm sorry that doesn't fulfill your requirement. the xml serializer is fulfilling the requirements of an xml serializer, which say it can write an "&" in the data as "& amp;".
  • 9. Re: XML serialisation problem using Java Document object
    812986 Newbie
    jtahlborn wrote:
    i'm sorry that doesn't fulfill your requirement. the xml serializer is fulfilling the requirements of an xml serializer, which say it can write an "&" in the data as "& amp;".
    I am not disputing that the XML serializer can write & as "& amp ;". But it should also be able to write the Unicode form "& # x 26 ;" for &, since that is also a valid case, which it is not doing.


    Anyway, can you suggest any other solution or approach that I should try to meet my requirement?

    Thanks.
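If the generated file really must contain the numeric reference `&#x26;` instead of `&amp;`, one possible workaround is to serialize to a string first and rewrite the escape afterwards. This is a sketch only, not something proposed in the thread; `NumericAmpersandDemo` and its methods are hypothetical names, and it assumes the document has no CDATA sections or comments in which the literal text `&amp;` appears, since a plain string replacement would alter those too:

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class NumericAmpersandDemo {

    // Build <test>text</test> with the text stored as plain characters.
    static Document docWithText(String text) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
            Element root = doc.createElement("test");
            root.appendChild(doc.createTextNode(text));
            doc.appendChild(root);
            return doc;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Serialize normally, then swap the named entity for the equivalent
    // hexadecimal character reference. Both spellings mean the same character
    // to an XML parser, so the parsed content is unchanged.
    static String serializeWithNumericAmpersand(Document doc) {
        try {
            Transformer t = TransformerFactory.newInstance().newTransformer();
            t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            t.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString().replace("&amp;", "&#x26;");
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // The text node holds a real ampersand, not an escaped string.
        Document doc = docWithText("A & B");
        System.out.println(serializeWithNumericAmpersand(doc));
    }
}
```

The resulting string can then be written to disk with any ordinary file API. Note that the text node must hold the plain `&` character for this to work; a node holding the string `&#x26;` would still be escaped by the serializer first.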
