8 Replies. Latest reply: May 25, 2012 8:18 AM by Moon123

    Compare 2 XML files

    Moon123
      I would like to write Java code to compare 2 XML files. The code should be able to tell whether both files have the same nodes, attributes, values, etc., and if not, display the differences. I am very new to XML parsing, and I have been googling this for a while now, but I am only getting more confused. I came across DOM, SAX, XOM, XSLT, XPath, XMLUnit, Oracle's XDK 10g, etc., but I am still not sure how to go about doing this. I would appreciate your help very much.
      Thank you in advance.
        • 1. Re: Compare 2 XML files
          gimbal2
          DOM would be the easiest way as that gives you an object structure to work with. Google for "java xml dom example".
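
          For reference, a minimal sketch of what the DOM route might look like with the JDK's built-in parser. The class name DomWalkDemo and its helper methods are made up for illustration, and the XML is parsed from a string only to keep the example self-contained; for real files you would pass a File to DocumentBuilder.parse instead.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical demo class: names are illustrative, not from the thread.
final class DomWalkDemo {

    /** Parses an XML string into a DOM tree. */
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    /** Lists every element in document order with its attribute count. */
    static List<String> describeElements(Document doc) {
        NodeList all = doc.getElementsByTagName("*"); // "*" matches all elements
        List<String> lines = new ArrayList<>();
        for (int i = 0; i < all.getLength(); i++) {
            Element e = (Element) all.item(i);
            lines.add(e.getTagName() + " attrs=" + e.getAttributes().getLength());
        }
        return lines;
    }

    public static void main(String[] args) {
        Document doc = parse("<order id='7'><item>pen</item></order>");
        describeElements(doc).forEach(System.out::println);
    }
}
```

          Once both files are loaded this way, each one is an ordinary tree of objects that can be walked side by side.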
          • 2. Re: Compare 2 XML files
            jtahlborn
            Also, note that comparing XML can be very tricky (since "equivalent" XML can be serialized in very different forms). In order to do it accurately, you may need to look into XML normalization and/or canonicalization.
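
            To illustrate the point, here is a rough sketch that leans on the standard DOM method Node.isEqualNode after a light normalization pass. It is not full canonicalization: the parser is merely asked to drop comments and fold CDATA into text, and adjacent text nodes are merged. The class name XmlEquivalenceDemo is hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical demo class: a light normalization, not W3C canonical XML.
final class XmlEquivalenceDemo {

    /** Parses the XML and returns its normalized root element. */
    static Element normalizedRoot(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            factory.setIgnoringComments(true); // comments do not count as content
            factory.setCoalescing(true);       // CDATA sections become plain text
            Element root = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
            root.normalize();                  // merge adjacent text nodes
            return root;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    /** True if both documents have equal trees after the normalization above. */
    static boolean sameContent(String leftXml, String rightXml) {
        return normalizedRoot(leftXml).isEqualNode(normalizedRoot(rightXml));
    }

    public static void main(String[] args) {
        // Attribute order and empty-tag style are serialization details only.
        System.out.println(sameContent("<a x='1' y='2'/>", "<a y='2' x='1'></a>"));
        // Whitespace between elements is still a text node, so this differs.
        System.out.println(sameContent("<a><b/></a>", "<a> <b/> </a>"));
    }
}
```

            Attribute order and quoting style stop mattering at this point, but indentation whitespace still does, which is where stricter normalization comes in.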
            • 3. Re: Compare 2 XML files
              Moon123
              Thank you both.
              I am going to do the following steps. Do they seem correct, or like the best way to go about it?

              1) Parse the 2 XML files using SAX (faster and uses less memory)
              2) Store element/value pairs in HashMaps
              3) Compare the two HashMaps to see if they both have the same value for the same keys (elements)
              4) Print out the differences if they are not the same

              Or is there a better or simpler way of doing it?

              Thanks a lot.
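
              Steps 1 and 2 of this plan could be sketched roughly as below. The key format (a slash-separated element path, with "@" marking attributes) and the class name SaxToMapDemo are assumptions made for the example. Note that a repeated sibling element overwrites the earlier entry under the same key, which is one limit of a flat map.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class: the path-style keys are an assumption.
final class SaxToMapDemo {

    /** Maps "/path/to/element" and "/path/@attr" keys to trimmed values. */
    static Map<String, String> pathValues(String xml) {
        Map<String, String> values = new LinkedHashMap<>();
        Deque<String> path = new ArrayDeque<>();   // currently open elements
        StringBuilder text = new StringBuilder();  // text of the current element

        DefaultHandler handler = new DefaultHandler() {
            private String key() {
                return "/" + String.join("/", path);
            }

            @Override
            public void startElement(String uri, String local, String qName, Attributes atts) {
                path.addLast(qName);
                text.setLength(0);
                for (int i = 0; i < atts.getLength(); i++) {
                    values.put(key() + "/@" + atts.getQName(i), atts.getValue(i));
                }
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                text.append(ch, start, length); // may arrive in several chunks
            }

            @Override
            public void endElement(String uri, String local, String qName) {
                String value = text.toString().trim();
                if (!value.isEmpty()) {
                    values.put(key(), value);
                }
                path.removeLast();
                text.setLength(0);
            }
        };

        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
        return values;
    }

    public static void main(String[] args) {
        System.out.println(pathValues("<order id='7'><item>pen</item><qty> 2 </qty></order>"));
    }
}
```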
              • 4. Re: Compare 2 XML files
                gimbal2
                Moon123 wrote:
                Thank you both.
                I am going to do the following steps. Do they seem correct, or like the best way to go about it?

                1) Parse in 2 XML files using SAX (faster and uses less memory)
                Those are not good reasons to use SAX in this case. It's also far more cumbersome, unfriendly to changes in the XML, and a heck of a lot more work. But you have more points...
                2) Store element/value pairs in HashMaps
                (that would eat up a large amount of memory again, defeating one of your reasons for using SAX)
                3) Compare the two HashMaps to see if they both have the same value for the same keys (elements)
                4) print out the differences if they are not the same
                That COULD work, if you had some wicked cleanup routines. Remember: a single space can already make two strings unequal. The ordering of elements and attributes is another thing that will make your life very difficult.
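
                One possible shape for such a cleanup, sketched on top of DOM: text is trimmed before comparison and attributes are compared as sorted maps, so stray spaces and attribute order stop mattering. The sketch assumes child elements appear in the same order in both files and ignores mixed content; XmlDiffDemo and its message format are invented for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical demo class: assumes children appear in the same order.
final class XmlDiffDemo {

    /** Returns one human-readable line per difference (empty if none). */
    static List<String> diff(String leftXml, String rightXml) {
        List<String> out = new ArrayList<>();
        compare(root(leftXml), root(rightXml), "", out);
        return out;
    }

    private static Element root(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml))).getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    private static void compare(Element l, Element r, String parent, List<String> out) {
        String path = parent + "/" + l.getTagName();
        if (!l.getTagName().equals(r.getTagName())) {
            out.add(path + ": element name differs, right has " + r.getTagName());
            return;
        }
        Map<String, String> la = attributes(l), ra = attributes(r);
        if (!la.equals(ra)) {
            out.add(path + ": attributes differ: " + la + " vs " + ra);
        }
        List<Element> lc = childElements(l), rc = childElements(r);
        if (lc.isEmpty() && rc.isEmpty()) {
            // Leaf element: compare text with surrounding whitespace removed.
            String lt = l.getTextContent().trim(), rt = r.getTextContent().trim();
            if (!lt.equals(rt)) {
                out.add(path + ": text differs: '" + lt + "' vs '" + rt + "'");
            }
        } else if (lc.size() != rc.size()) {
            out.add(path + ": child count differs: " + lc.size() + " vs " + rc.size());
        } else {
            for (int i = 0; i < lc.size(); i++) {
                compare(lc.get(i), rc.get(i), path, out);
            }
        }
    }

    /** Attributes as a sorted map, so their order in the file is irrelevant. */
    private static Map<String, String> attributes(Element e) {
        Map<String, String> map = new TreeMap<>();
        NamedNodeMap attrs = e.getAttributes();
        for (int i = 0; i < attrs.getLength(); i++) {
            map.put(attrs.item(i).getNodeName(), attrs.item(i).getNodeValue());
        }
        return map;
    }

    /** Direct child elements only; whitespace text nodes are skipped. */
    private static List<Element> childElements(Element e) {
        List<Element> kids = new ArrayList<>();
        for (Node n = e.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n.getNodeType() == Node.ELEMENT_NODE) {
                kids.add((Element) n);
            }
        }
        return kids;
    }
}
```

                Handling reordered child elements would need a matching rule (for example, pairing children by an id attribute), which depends on the documents being compared.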
                • 5. Re: Compare 2 XML files
                  Moon123
                  I understand, gimbal2, but what do you suggest I do instead?
                  • 6. Re: Compare 2 XML files
                    gimbal2
                    Uh, what has already been said? Right now, this is the flow of this thread:

                    1) question
                    2) answers
                    3) ignoring answers and going in a completely different direction
                    4) getting advice
                    5) asking what else to do

                    Whatever you want, dude. Whichever path you take, it's going to be long and boring work, but my gut tells me that using DOM will make it slightly less work.
                    • 7. Re: Compare 2 XML files
                      Moon123
                      @gimbal2:
                      No, that is not the way this thread is going at all... I got the answers to my first question and I then moved on to the second part. I thought that was rather obvious.
                      When I asked about using HashMaps, you said "that COULD work", which made me think that maybe you had a better solution than HashMaps that could make my life a bit easier, but apparently NOT... so that's fine.


                      @Everyone else
                      I also had some questions about comparing hashmaps but I just found out that I can use

                      public static <K,V> MapDifference<K,V> difference(Map<? extends K,? extends V> left,
                      Map<? extends K,? extends V> right)

                      from the Google Guava library (the Maps.difference method) to compare two HashMaps.
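
                      For anyone who would rather not add Guava, here is a JDK-only sketch that reports roughly the same three groups that MapDifference exposes (entries only on the left, entries only on the right, and entries whose values differ). The class name MapDiffDemo and the message wording are made up for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical demo class: a plain-JDK stand-in, not Guava's implementation.
final class MapDiffDemo {

    /** Returns one line per key that is missing on one side or has a different value. */
    static List<String> describeDifferences(Map<String, String> left, Map<String, String> right) {
        // Visit the union of both key sets in sorted order for stable output.
        Set<String> keys = new TreeSet<>(left.keySet());
        keys.addAll(right.keySet());

        List<String> out = new ArrayList<>();
        for (String key : keys) {
            if (!right.containsKey(key)) {
                out.add("only in left: " + key + "=" + left.get(key));
            } else if (!left.containsKey(key)) {
                out.add("only in right: " + key + "=" + right.get(key));
            } else if (!Objects.equals(left.get(key), right.get(key))) {
                out.add("differs: " + key + " left=" + left.get(key) + " right=" + right.get(key));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, String> left = Map.of("a", "1", "b", "2", "c", "3");
        Map<String, String> right = Map.of("a", "1", "b", "9", "d", "4");
        describeDifferences(left, right).forEach(System.out::println);
    }
}
```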

                      Edited by: Moon123 on May 23, 2012 3:59 PM
                      • 8. Re: Compare 2 XML files
                        Moon123
                        I think I got what I was looking for. Thank you all.