9 Replies Latest reply: Mar 20, 2009 7:18 PM by 800308

    XML API

    843785
      I'm interested in working on XML and I came across JDOM. I was wondering if that is the "best" XML API currently. If not, what API should I look into? Any help would be appreciated. Thank you.
        • 1. Re: XML API
          843785
          Define "best".

          Life was easier before Java:

          Eve: Do you love me, Adam?
          Adam: Who else?

          (Joke is much funnier with Borscht-Belt accents.)
          • 2. Re: XML API
            843785
            This answer will sound really vague and trite, but in the case of XML processing it actually carries more weight than in other cases: it depends entirely on what you want to do with your XML.

            Currently, I deal with a fair amount of XML in my work, but rarely actually write any Java XML code at all. It's all done with XSLT and binding frameworks. That doesn't make it the right approach for every task, not by a long way.
            • 3. Re: XML API
              843785
              Well, I'm reading this Java XML guide that was written a few years back. It said JDOM allows you to access any part of the DOM tree at any time, unlike SAX, and that it's much simpler than DOM. So I was wondering if there are any other APIs out there now that do the same and maybe more. Also, what's the most popular XML API that Java people use? (I guess that could give me an indication of "best".)
              • 4. Re: XML API
                843785
                I like an API with [StAX Appeal|http://today.java.net/pub/a/today/2006/07/20/introduction-to-stax.html]

                But I also like JAXB and XMLBeans

                And if we're talking about parsing configs, you can abstract away the XML and use [http://commons.apache.org/configuration/]
                • 5. Re: XML API
                  843785
                  GlideKensington wrote:
                  Also, what's the most popular XML API for Java people use? (I guess that could give me an indication of "best")
                  Not so. At all
                  • 6. Re: XML API
                    Puce
                    For importing/ exporting data from/ to XML I recommend using JAXB, if you don't have a good reason not to use it.
                    • 7. Re: XML API
                      843785
                      Package org.w3c.dom is built around language-agnostic guidelines.

                      JDOM is a replacement for the DOM API, and it uses Java's features like Collections etc. where applicable, as opposed to an API providing counts and element-getters but no iterators. Therefore JDOM is more convenient and it better fits into a framework with, say, Velocity.
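
                      To make the "counts and element-getters but no iterators" point concrete, here is a minimal sketch using only the JDK's org.w3c.dom classes. The class name, the method name and the sample XML are made up for illustration. JDOM's Element.getChildren(), by contrast, returns a java.util.List that can go straight into a for-each loop.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical example class, not part of any library.
class DomCountAndGetter {

    // Collects the text of every <title> element using only getLength() and item(i).
    static List<String> titles(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            NodeList nodes = doc.getElementsByTagName("title");
            List<String> result = new ArrayList<>();
            // NodeList is not Iterable, so a for-each loop is not available here.
            for (int i = 0; i < nodes.getLength(); i++) {
                Element title = (Element) nodes.item(i);
                result.add(title.getTextContent());
            }
            return result;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<books><book><title>A</title></book><book><title>B</title></book></books>";
        System.out.println(titles(xml)); // prints [A, B]
    }
}
```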


                      But DOM is only a part of the XML picture.
                      • 8. Re: XML API
                        843785
                        Thanks! That really helped me.

                        Thank you to the others who also gave suggestions.
                        • 9. Re: XML API
                          800308
                          h1. DOM

                          DOMs are only useful for small (say <= 1 MB) XML documents because DOM is very memory hungry.

                          The rule of thumb is: minimumDomRam = sizeOfXmlFileOnDisk * 3... and that's a minimum; I've seen it go as high as 6 * sizeOfXmlFileOnDisk.

                          DOM keeps the whole of the parsed document tree in memory, which can be especially problematic in a Java server. A server must by its nature be able to satisfy many concurrent client requests, which means no single user or process can be allowed to "hog resources". So on the server side we mainly use SAX parsers, or a "higher-level" XML-binding technology like XMLBeans (or JAXB apparently, though I've never personally used it, because most of our stuff predates JAXB's release, and why swap horses now?).

                          <snip>I was writing a treatise on virtual memory, thrashing, and how to avoid it... but you can google those terms and find more instructive articles than anything I could write.</snip>

                          DOM's strength is that it facilitates modifying the contents of the document directly in code. So you can build a new DOM document (or deserialize and then modify an existing one) and then just automagically serialize it. Marvelous! (So long as that process only ever handles small XML documents!)
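
                          A minimal sketch of that deserialize, modify, serialize round trip, using only JDK classes. The class name, the addItem method and the element name "item" are invented for the example, and serializing through an identity Transformer is just one common way to write a DOM back out.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical example class, not part of any library.
class DomEditRoundTrip {

    // Parses the XML into a DOM, appends one <item> to the root, and serializes the result.
    static String addItem(String xml, String itemText) {
        try {
            // Deserialize: the whole tree is now in memory.
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));

            // Modify the tree directly in code.
            Element item = doc.createElement("item");
            item.setTextContent(itemText);
            doc.getDocumentElement().appendChild(item);

            // Serialize with an identity transform (no stylesheet).
            Transformer identity = TransformerFactory.newInstance().newTransformer();
            identity.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            identity.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("DOM round trip failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(addItem("<list><item>one</item></list>", "two"));
    }
}
```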

                          h1. JDOM

                          JDOM (as previously stated) is a Java-native DOM... In my mind it sits between a "real language-agnostic W3C DOM" and the XML-binders like XMLBeans. It gives you much of the functionality of a DOM with a significantly smaller memory footprint and much faster (though still not brilliant) processing primitives. But it's still suitable only for small-ish XML documents (say < 2 or maybe 3 MB, depending on all sorts of stuff).

                          h1. SAX

                          SAX parsers process a "stream of elements", meaning that you only keep each XML element's contents in memory for as long as you need to process that element. So (if you're sane) you never have the whole document in memory at once, hence you use a ship-load less RAM than you would for the equivalent DOM. SAX's "callbacks" are very fast (as fast as can be). So SAX is the only way to go when you handle even-sometimes-big (say 5 MB plus) documents.
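
                          A minimal sketch of the callback ("push") style, using the JDK's built-in SAX parser. The element name "line", the attribute "qty" and the class name are made up for the example; the point is that the handler keeps one running total and never holds the document.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical example class, not part of any library.
class SaxOrderTotal {

    // Sums the qty attribute of every <line> element as the parser streams past it.
    static int totalQuantity(String xml) {
        final int[] total = {0};
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes attrs) {
                // The parser calls this once per start tag; only the running total is kept.
                if ("line".equals(qName)) {
                    total[0] += Integer.parseInt(attrs.getValue("qty"));
                }
            }
        };
        try {
            SAXParserFactory.newInstance()
                    .newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("SAX parse failed", e);
        }
        return total[0];
    }

    public static void main(String[] args) {
        String xml = "<order><line qty=\"2\"/><line qty=\"5\"/></order>";
        System.out.println(totalQuantity(xml)); // prints 7
    }
}
```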

                          SAX doesn't facilitate modifying the contents of the document directly. You can use XSLT (transforms) to modify existing models, but this process is very cumbersome and technically terribly confusterpating (ergo: basically pretty sh1tty), and therefore (IMHO) should (almost always) be avoided.

                          h1. StAX

                          StAX sort-of sits in between DOM and SAX.

                          StAX also processes XML as a stream of elements. Conceptually speaking, StAX gives you a forward-only element iterator on the document tree. It differs from SAX in that your program "pulls" the elements as you require them, whereas SAX logically "pushes" the elements to you to deal with as they occur in the document. In some circumstances, "pulling" the data as you need it can dramatically reduce the "statefulness" (the number/size of elements you need to remember at any one time) inherent in the process.
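
                          A minimal sketch of the "pull" style with the JDK's javax.xml.stream API. The element name "name" and the class name are made up for the example. Note that the loop belongs to your code, not to the parser, and that getElementText() pulls an element's text on the spot instead of waiting for a later callback.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;

// Hypothetical example class, not part of any library.
class StaxNames {

    // Walks forward through the document and collects the text of each <name> element.
    static List<String> names(String xml) {
        List<String> names = new ArrayList<>();
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(xml));
            // The program asks for the next event when it is ready for it.
            while (reader.hasNext()) {
                int event = reader.next();
                if (event == XMLStreamConstants.START_ELEMENT
                        && "name".equals(reader.getLocalName())) {
                    // Reads the text content and leaves the cursor on the matching end tag.
                    names.add(reader.getElementText());
                }
            }
            reader.close();
        } catch (Exception e) {
            throw new IllegalStateException("StAX parse failed", e);
        }
        return names;
    }

    public static void main(String[] args) {
        String xml = "<people><person><name>Ann</name></person>"
                + "<person><name>Bob</name></person></people>";
        System.out.println(names(xml)); // prints [Ann, Bob]
    }
}
```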

                          StAX excels at processing large (not huge) documents with relatively complex (i.e. inherently stateful) schemas. The downside of StAX is that (IMHO) your parse code is more complex than the SAX equivalent. StAX operations are also a tad slower than their SAX equivalents.

                          If I wrote my own XML-binding-code-generator it would use StAX under the hood.

                          h1. Binders

                          XML-binders allow you to construct and/or modify an "object graph" (a tree-like data structure of Java objects which represents the same "model" as the XML). This graph is logically equivalent to a DOM. You still have the whole object graph in memory at one time, but because they're "native objects" (as opposed to an abstract model of those objects) there's a lot less overhead; meaning that (depending on the nature of the data therein) an object graph tends to be much much smaller in memory than its equivalent XML document (GOOD!) and the operations are much much faster because native objects offer native accessors and mutators (GOOD! GOOD!).
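
                          To show what "native objects with native accessors" looks like, here is a hand-written stand-in for what a binder produces. This is not JAXB or XMLBeans (real binders generate or annotate such classes for you from a schema); it is a sketch that fills plain Java objects using the JDK's StAX reader. The Person class, the "person" element and the "age" attribute are all invented for the example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;

// Hypothetical hand-rolled binder, not part of any library.
class HandRolledBinder {

    // A plain Java object standing in for a generated binding class.
    static final class Person {
        private final String name;
        private final int age;

        Person(String name, int age) {
            this.name = name;
            this.age = age;
        }

        String getName() { return name; }
        int getAge() { return age; }
    }

    // Turns <person age="..">name</person> elements into typed Person objects.
    static List<Person> bind(String xml) {
        List<Person> people = new ArrayList<>();
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(xml));
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT
                        && "person".equals(reader.getLocalName())) {
                    // Read the attribute first: getElementText() moves the cursor to the end tag.
                    int age = Integer.parseInt(reader.getAttributeValue(null, "age"));
                    String name = reader.getElementText();
                    people.add(new Person(name, age));
                }
            }
            reader.close();
        } catch (Exception e) {
            throw new IllegalStateException("Binding failed", e);
        }
        return people;
    }

    public static void main(String[] args) {
        String xml = "<people><person age=\"30\">Ann</person><person age=\"41\">Bob</person></people>";
        for (Person p : bind(xml)) {
            // Typed accessors instead of string lookups on a generic tree.
            System.out.println(p.getName() + " is " + p.getAge());
        }
    }
}
```

                          Once bound, the age is already an int and the name a String, so downstream code never touches the XML API again.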

                          ------

                          If anyone strongly disagrees with anything I've said I'm all ears... This is just my opinion, and I stand (or sit, as the case may be) to be corrected.

                          Cheers. Keith.

                          Edited by: corlettk on 21/03/2009 10:15 ~~ Can't spell, can't type, can't code. Darn!