5 Replies Latest reply: Jul 19, 2011 10:54 AM by DrClap

    JAXP DOM reading and writing issues

    838975
      import org.xml.sax.*;
      import org.w3c.dom.*;
      import javax.xml.parsers.*;
      import javax.xml.transform.*;
      import javax.xml.transform.dom.DOMSource;
      import javax.xml.transform.stream.StreamResult;

      import java.io.*;

      public class Test
      {
          public static void main(String[] args) throws Exception
          {
              DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
              dbf.setIgnoringElementContentWhitespace(true);
              DocumentBuilder db = dbf.newDocumentBuilder();

              read(db, "xml_in.xml");
              create(db, "xml_out.xml");
          }

          public static void read(DocumentBuilder db, String fileName)
          {
              Document d = null;
              try
              {
                  d = db.parse(new File(fileName));
              }
              catch (IOException ex)
              {
                  ex.printStackTrace();
                  return;
              }
              catch (SAXException ex)
              {
                  ex.printStackTrace();
                  return;
              }

              Node n = d.getDocumentElement();
              System.out.println("Name of the root element: " + n.getNodeName());

              NodeList nl = null;
              nl = d.getElementsByTagName("*");
              System.out.println("Number of element:" + nl.getLength());
              System.out.println();

              nl = d.getElementsByTagName("user");
              System.out.println("Length:" + nl.getLength());

              for (int i = 0; i < nl.getLength(); i++)
              {
                  System.out.println("id: " + nl.item(i).getAttributes().getNamedItem("id").getNodeValue());
                  System.out.println("name: " + nl.item(i).getFirstChild().getTextContent());
              }
          }

          public static void create(DocumentBuilder db, String fileName) throws Exception
          {
              Document d = db.newDocument();

              d.appendChild(d.createComment("This is comment"));

              Element ele_root = d.createElement("root");
              d.appendChild(ele_root);

              Element ele_temp;

              ele_temp = d.createElement("sub");
              ele_temp.setAttribute("id", "1");
              ele_temp.appendChild(d.createTextNode("data"));
              ele_root.appendChild(ele_temp);

              ele_temp = d.createElement("sub");
              ele_temp.setAttribute("id", "2");
              ele_root.appendChild(ele_temp);

              //adding node
              NodeList nl = d.getElementsByTagName("sub");
              Element ele_parent = (Element) nl.item(0).getParentNode();
              ele_temp = d.createElement("sub");
              ele_temp.setAttribute("id", "3");
              ele_parent.appendChild(ele_temp);

              d.normalize();
              Transformer t = TransformerFactory.newInstance().newTransformer();
              //t.setOutputProperty(OutputKeys.METHOD, "test");
              t.transform(new DOMSource(d), new StreamResult(new File(fileName)));
          }
      }
      xml_in.xml
      <?xml version = "1.0" ?>
      <user-detail>
      
           <user     id = "1"><name>user1</name><age>10</age></user>
      
           <user     id = "2">
                <name>user2</name>
                <age>20</age>
           </user>
      
           <user     id = "3">
                <name>user3</name>
                <age>30</age>
           </user>
      </user-detail>
      The result from read(db, "xml_in.xml"):
      Name of the root element: user-detail
      Number of element:10

      Length:3
      id: 1
      name: user1
      id: 2
      name:
                
      id: 3
      name:
      The names for id 2 and id 3 are missing because of the whitespace. How can I remove the whitespace to avoid this problem?
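For reference, the Javadoc for setIgnoringElementContentWhitespace notes that the setting relies on the content model, so it only takes effect when the parser is validating against a DTD. That would explain why the call in the code above changes nothing for xml_in.xml. The sketch below assumes a hypothetical inline DTD (not part of the original file) and an in-memory copy of one user record. It is only an illustration of that precondition, not a recommendation to rely on it.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

class IgnoreWhitespaceSketch {
    // Assumption: this DTD is invented for the example; xml_in.xml has none.
    static final String XML =
          "<?xml version=\"1.0\"?>\n"
        + "<!DOCTYPE user-detail [\n"
        + "  <!ELEMENT user-detail (user*)>\n"
        + "  <!ELEMENT user (name, age)>\n"
        + "  <!ATTLIST user id CDATA #REQUIRED>\n"
        + "  <!ELEMENT name (#PCDATA)>\n"
        + "  <!ELEMENT age (#PCDATA)>\n"
        + "]>\n"
        + "<user-detail>\n"
        + "  <user id=\"2\">\n"
        + "    <name>user2</name>\n"
        + "    <age>20</age>\n"
        + "  </user>\n"
        + "</user-detail>\n";

    // Returns the node name of the first child of the first <user> element.
    static String firstChildName() {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setValidating(true);                        // needed for the next setting
            dbf.setIgnoringElementContentWhitespace(true);
            DocumentBuilder db = dbf.newDocumentBuilder();
            // Fail loudly on validation errors instead of printing warnings.
            db.setErrorHandler(new DefaultHandler() {
                @Override
                public void error(SAXParseException e) throws SAXException {
                    throw e;
                }
            });
            Document d = db.parse(new InputSource(new StringReader(XML)));
            Node user = d.getElementsByTagName("user").item(0);
            return user.getFirstChild().getNodeName();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("first child of user: " + firstChildName());
    }
}
```

This only works for documents that carry (or reference) a DTD, which is why looking up the "name" element explicitly is usually the more robust fix.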


      The result from create(db,"xml_out.xml"):
      <?xml version="1.0" encoding="UTF-8" standalone="no"?><!--This is comment--><root><sub id="1">data</sub><sub id="2"/><sub id="3"/></root>
      How can I get it formatted nicely, as below?
      <?xml version="1.0" encoding="UTF-8" standalone="no"?>
      <!--This is comment-->
      <root>
           <sub id="1">data</sub>
           <sub id="2"/>
           <sub id="3"/>
      </root>
      thanks~
        • 1. Re: JAXP DOM reading and writing issues
          798692
          anIdiot wrote:
               d.normalize();
               Transformer t = TransformerFactory.newInstance().newTransformer();
               //t.setOutputProperty(OutputKeys.METHOD, "test");
               t.transform(new DOMSource(d), new StreamResult(new File(fileName)));
          Set the output property for the transformation like the following.
          d.normalize();
          Transformer t = TransformerFactory.newInstance().newTransformer();
          t.setOutputProperty("{http://xml.apache.org/xslt}indent-amount", "2");
          t.setOutputProperty(OutputKeys.INDENT, "yes");
          t.transform(new DOMSource(d), new StreamResult(new File(fileName)));
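A self-contained way to try the indentation settings is to serialize to a string instead of a file. The sketch below rebuilds a document similar to the one in create() and returns the serialized text. Note that the "{http://xml.apache.org/xslt}indent-amount" key is specific to Xalan-derived transformers such as the JDK default; the assumption here is that another implementation would simply ignore it. The exact line breaks also vary between JDK versions, so no expected output is shown.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class PrettyPrintSketch {
    // Builds <root> with three <sub> children and serializes it with indentation.
    static String render() {
        try {
            Document d = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
            Element root = d.createElement("root");
            d.appendChild(root);
            for (int id = 1; id <= 3; id++) {
                Element sub = d.createElement("sub");
                sub.setAttribute("id", String.valueOf(id));
                if (id == 1) {
                    sub.appendChild(d.createTextNode("data"));
                }
                root.appendChild(sub);
            }

            Transformer t = TransformerFactory.newInstance().newTransformer();
            t.setOutputProperty(OutputKeys.INDENT, "yes");
            // Implementation-specific key (Xalan / JDK built-in transformer).
            t.setOutputProperty("{http://xml.apache.org/xslt}indent-amount", "2");

            StringWriter out = new StringWriter();
            t.transform(new DOMSource(d), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(render());
    }
}
```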
          • 2. Re: JAXP DOM reading and writing issues
            DrClap
            Removing the whitespace is the wrong approach. Instead you should realize that the creators of that XML can put in whitespace wherever they feel like it, and more importantly, that the whitespace is also part of the document. In particular it forms text nodes which are children of the element in which they are located, just as element nodes are children.

            So your strategy of assuming that the "name" element will be the first child of the "user" element is incorrect. When there is whitespace before the "name" element, that whitespace will be the first child. So your strategy should be to get the "name" element which is a child of the "user" element.

            And by the way, calling the "normalize" method of the DOM won't do anything to affect that. As usual Ram manages to provide un-useful information.
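The point about whitespace being a child node can be checked directly. The following sketch parses an in-memory copy of the second user record (the string constant and the helper name firstChildElement are made up for this example), reports what the first child of "user" actually is, and then walks the siblings until it finds the "name" element.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class ChildNodesSketch {
    static final String XML =
          "<user-detail>\n"
        + "  <user id=\"2\">\n"
        + "    <name>user2</name>\n"
        + "    <age>20</age>\n"
        + "  </user>\n"
        + "</user-detail>";

    static Element firstUser() {
        try {
            Document d = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(XML)));
            return (Element) d.getElementsByTagName("user").item(0);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Hypothetical helper: first direct child element with the given tag name, or null.
    static Element firstChildElement(Node parent, String tagName) {
        for (Node c = parent.getFirstChild(); c != null; c = c.getNextSibling()) {
            if (c.getNodeType() == Node.ELEMENT_NODE && tagName.equals(c.getNodeName())) {
                return (Element) c;
            }
        }
        return null;
    }

    // Node name of the first child of <user>; a whitespace text node reports "#text".
    static String firstChildNodeName() {
        return firstUser().getFirstChild().getNodeName();
    }

    static String userName() {
        Element name = firstChildElement(firstUser(), "name");
        return name == null ? null : name.getTextContent();
    }

    public static void main(String[] args) {
        System.out.println("first child: " + firstChildNodeName());
        System.out.println("name: " + userName());
    }
}
```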
            • 3. Re: JAXP DOM reading and writing issues
              838975
              The code below doesn't work.
              May I know how to get the "name" element which is a child of the "user" element?
              System.out.println("name: "+nl.item(i).getChildNodes().getElementsByTagName("name").getTextContent());
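The line above does not compile because getChildNodes() returns a NodeList, which has no getElementsByTagName method, and getElementsByTagName itself returns a NodeList rather than a single node. One possible fix, sketched below with an in-memory copy of xml_in.xml, is to cast each "user" node to Element and take item(0) of its "name" matches. Keep in mind that Element.getElementsByTagName searches all descendants, not just direct children; that is harmless for this document but is an assumption about its shape.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class NameLookupSketch {
    static final String XML =
          "<user-detail>\n"
        + "  <user id=\"1\"><name>user1</name><age>10</age></user>\n"
        + "  <user id=\"2\">\n"
        + "    <name>user2</name>\n"
        + "    <age>20</age>\n"
        + "  </user>\n"
        + "  <user id=\"3\">\n"
        + "    <name>user3</name>\n"
        + "    <age>30</age>\n"
        + "  </user>\n"
        + "</user-detail>";

    // Returns "id=name" pairs joined by commas.
    static String idsAndNames() {
        try {
            Document d = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(XML)));
            NodeList users = d.getElementsByTagName("user");
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < users.getLength(); i++) {
                Element user = (Element) users.item(i);
                // First <name> descendant of this user, or null if there is none.
                Node name = user.getElementsByTagName("name").item(0);
                if (i > 0) {
                    sb.append(',');
                }
                sb.append(user.getAttribute("id")).append('=')
                  .append(name == null ? "" : name.getTextContent());
            }
            return sb.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(idsAndNames());
    }
}
```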
              • 4. Re: JAXP DOM reading and writing issues
                798692
                anIdiot wrote:
                The code below doesn't work.
                May I know how to get the "name" element which is a child of the "user" element?
                You can go with XPath to parse the XML document based on the elements, rather than walking the DOM by hand. Walking the DOM yourself is only worth it if you want to process the whole XML document.
                • 5. Re: JAXP DOM reading and writing issues
                  DrClap
                  ram wrote:
                  You can go with XPath to parse the xml document based on the elements...
                  +1 to that.

                  If you want to focus on specific nodes like that, then XPath is much better than a ton of DOM-bashing code.
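As a rough illustration of the XPath route, the sketch below uses the javax.xml.xpath API that ships with the JDK against an in-memory copy of xml_in.xml. The class name and the expressions "/user-detail/user", "@id" and "name" are choices made for this example. Because "name" is evaluated relative to each user element, the whitespace text nodes never get in the way.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class XPathLookupSketch {
    static final String XML =
          "<user-detail>\n"
        + "  <user id=\"1\"><name>user1</name><age>10</age></user>\n"
        + "  <user id=\"2\">\n"
        + "    <name>user2</name>\n"
        + "    <age>20</age>\n"
        + "  </user>\n"
        + "  <user id=\"3\">\n"
        + "    <name>user3</name>\n"
        + "    <age>30</age>\n"
        + "  </user>\n"
        + "</user-detail>";

    // Returns "id=name" pairs joined by commas, selected with XPath.
    static String idsAndNames() {
        try {
            Document d = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(XML)));
            XPath xp = XPathFactory.newInstance().newXPath();
            NodeList users = (NodeList) xp.evaluate(
                    "/user-detail/user", d, XPathConstants.NODESET);
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < users.getLength(); i++) {
                Node user = users.item(i);
                // Relative expressions are evaluated against the current user node.
                String id = xp.evaluate("@id", user);
                String name = xp.evaluate("name", user);
                if (i > 0) {
                    sb.append(',');
                }
                sb.append(id).append('=').append(name);
            }
            return sb.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(idsAndNames());
    }
}
```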