This discussion is archived
13 Replies. Latest reply: Jan 8, 2009 5:00 AM by 807589

Creating large XML files

807589 Newbie
I have been working with DOM parser for creating XML files, but as the size of the XML file increases, the memory allocated to the JVM gets consumed at a high rate. So I needed to find another parser that can make my life much easier. SAX parser is not a solution for me, as it gives problems with doc manipulation and a lot of complex coding will be involved in callback functions.

Thank you in advance
  • 1. Re: Creating large XML files
    807589 Newbie
    If you want any document model, then it will consume a lot of memory.
    How big is your XML file and what is your maximum memory size?
  • 2. Re: Creating large XML files
    807589 Newbie
    The memory allocated to the JVM is 129792 KB.

    The size of the XML file to be created is greater than 1 GB.
  • 3. Re: Creating large XML files
    807589 Newbie
    vivek_kumar_kohli wrote:
    The size of the XML file to be created is greater than 1 GB.
    Really?
  • 4. Re: Creating large XML files
    807589 Newbie
    Ya man.
    I have to insert a lot of images in the form of strings.
  • 5. Re: Creating large XML files
    807589 Newbie
    Try XSLT parsing with SAX.
    You just need to override the startElement(), endElement() and characters() methods.
  • 6. Re: Creating large XML files
    807589 Newbie
    Are you sure that you want to be doing that? Typically you would just store references (most likely pathnames) to the image files. Either way, if you have a 1GB XML file, you are likely to use more than 1GB of memory since the data needs to be somewhere.

    I would suggest writing code that targets your requirements efficiently. If you know what nodes you are trying to reach, then write code that will just pull those out (whether it be SAX-based or some form of lazy DOM-loading). If indeed loading the whole DOM is the best solution, then take the images out of the XML file.
  • 7. Re: Creating large XML files
    807589 Newbie
    i.d.e. wrote:
    Are you sure that you want to be doing that? Typically you would just store references (most likely pathnames) to the image files. Either way, if you have a 1GB XML file, you are likely to use more than 1GB of memory since the data needs to be somewhere.
    means what ?
    I would suggest writing code that targets your requirements efficiently. If you know what nodes you are trying to reach, then write code that will just pull those out (whether it be SAX-based or some form of lazy DOM-loading). If indeed loading the whole DOM is the best solution, then take the images out of the XML file.
    i dont knw what r u trying to convey .
  • 8. Re: Creating large XML files
    807589 Newbie
    My response was to the topic creator, but for clarification regardless...

    It sounds like (s)he is encoding images as strings (e.g. base64) and storing that data in the XML file itself. My proposal was to go with a more traditional solution and just store the pathnames or URLs or whatever to the images.

    So, instead of doing something like (get ready for some XHTML...):
    <object data="[encoded data]+"/>
    I was suggesting to do:
    <object data="images/my-image.png"/>

    My second paragraph was just suggesting some methods that would be less memory-intensive at a given time than storing the whole DOM tree in memory would be.
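
    To put a rough number on the difference: base64 turns every 3 bytes of image data into 4 characters, so embedding grows the payload by about a third before any DOM overhead is counted. Below is a minimal sketch using the standard java.util.Base64 class (Java 8 or later). The class name, the method name and the 300-byte placeholder array are made up for illustration.

```java
import java.util.Base64;

// Hypothetical illustration class, not part of any library.
class EmbeddedVersusReferenced {

    /** Number of characters needed to embed rawBytes as a base64 string. */
    static int embeddedLength(byte[] rawBytes) {
        return Base64.getEncoder().encodeToString(rawBytes).length();
    }

    public static void main(String[] args) {
        byte[] fakeImage = new byte[300];          // placeholder, not a real image
        String reference = "images/my-image.png";  // the path from the example above

        System.out.println("embedded:   " + embeddedLength(fakeImage) + " chars");
        System.out.println("referenced: " + reference.length() + " chars");
    }
}
```

    The embedded form scales with the image size, while the reference stays the length of the path no matter how large the image is.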
  • 9. Re: Creating large XML files
    807589 Newbie
    If you want to use the DOM for XML you are going to need around 4 GB of memory on a 64-bit JVM/OS.

    If you don't have a 64-bit OS or this much memory you need to change the structure of your program so you don't load the whole file at once.
  • 10. Re: Creating large XML files
    807589 Newbie
    AmitChalwade123456 wrote:
    i.d.e. wrote:
    Are you sure that you want to be doing that? Typically you would just store references (most likely pathnames) to the image files. Either way, if you have a 1GB XML file, you are likely to use more than 1GB of memory since the data needs to be somewhere.
    means what ?
    A 1GB XML file in UTF-8 represented in DOM will typically take 2GB of Java memory for the strings (each char takes 2 bytes), and about half to twice the same again for objects to represent elements and attributes (depending on how much mark-up is present in the document).

    If you run Java with less memory than that, it won't fit into the space available.
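
    Worked through for the 1GB file in this thread, and assuming the text is mostly ASCII (which base64 image data is), that estimate comes out at roughly 3GB to 6GB. This is only a back-of-the-envelope sketch; the class and method names are invented for illustration.

```java
// Hypothetical helper that just restates the estimate above as arithmetic.
class DomMemoryEstimate {
    static final long GB = 1024L * 1024 * 1024;

    /** Bytes for the text alone: one 2-byte Java char per byte of (mostly ASCII) UTF-8. */
    static long stringBytes(long fileBytes) {
        return fileBytes * 2;
    }

    /** Strings plus node objects; overheadFactor is 0.5 to 2.0 times the string size. */
    static long totalBytes(long fileBytes, double overheadFactor) {
        long strings = stringBytes(fileBytes);
        return strings + (long) (strings * overheadFactor);
    }

    public static void main(String[] args) {
        long file = 1 * GB;
        System.out.println("low estimate:  " + totalBytes(file, 0.5) / GB + " GB");
        System.out.println("high estimate: " + totalBytes(file, 2.0) / GB + " GB");
        // Heap size reported earlier in the thread, converted from KB to whole MB.
        System.out.println("heap reported: " + 129792L / 1024 + " MB");
    }
}
```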
    i dont knw what r u trying to convey .
    Is there a problem with your keyboard? Some of your vowels don't seem to be working correctly.
  • 11. Re: Creating large XML files
    807589 Newbie
    vivek_kumar_kohli wrote:
    I have been working with DOM parser for creating XML files
    Eh? The parser is for reading XML files, not creating them. You can create them from a DOM using the Transformer mechanism, but that's not about the parser.
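
    For reference, writing a DOM out through a Transformer looks roughly like the sketch below. The element and attribute names (images, image, path) and the class name are invented for illustration. Note that the whole tree still has to exist in memory before anything is written.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical illustration class.
class DomToXmlSketch {

    /** Builds a tiny DOM in memory, then serializes it with an identity Transformer. */
    static String buildAndSerialize(String imagePath) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
            Element root = doc.createElement("images");   // invented element name
            doc.appendChild(root);
            Element image = doc.createElement("image");   // invented element name
            image.setAttribute("path", imagePath);
            root.appendChild(image);

            // newTransformer() with no stylesheet copies the source to the result.
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(buildAndSerialize("images/my-image.png"));
    }
}
```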
    but as the size of the XML file increases, the memory allocated to the JVM gets consumed at a high rate. So I needed to find another parser that can make my life much easier. SAX parser is not a solution for me, as it gives problems with doc manipulation and a lot of complex coding will be involved in callback functions.
    It's the nature of the beast: either you build the whole document tree in memory, in which case you need the whole thing stored and lots of RAM, or you process it looking at a small part at a time, which is going to be more complicated.

    Using SAX isn't that complicated; basically, I just use a different ContentHandler for each significant element type. You work with a stack of content handlers, pushing a new one on the stack when a tag opens, and popping it off when the element closes.

    This kind of code replaces code which walks the DOM tree, and it's really not all that different in overall structure.
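
    A minimal sketch of that handler-stack idea, assuming a made-up document with <image> elements whose text is a file path. HandlerStackSketch, ImageHandler and handlerFor are hypothetical names; only the SAX classes themselves come from the JDK.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Dispatcher that forwards SAX events to whichever handler is on top of the stack.
class HandlerStackSketch extends DefaultHandler {
    private final Deque<DefaultHandler> stack = new ArrayDeque<>();
    final List<String> imagePaths = new ArrayList<>();

    /** Handler for the (invented) <image> element: collects the element's text. */
    private class ImageHandler extends DefaultHandler {
        private final StringBuilder text = new StringBuilder();

        @Override
        public void characters(char[] ch, int start, int length) {
            text.append(ch, start, length);
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            imagePaths.add(text.toString().trim());
        }
    }

    /** One handler per significant element type; everything else gets a no-op handler. */
    private DefaultHandler handlerFor(String qName) {
        return "image".equals(qName) ? new ImageHandler() : new DefaultHandler();
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        stack.push(handlerFor(qName));                  // tag opens: push a handler
    }

    @Override
    public void characters(char[] ch, int start, int length) throws SAXException {
        if (!stack.isEmpty()) {
            stack.peek().characters(ch, start, length); // text goes to the current handler
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) throws SAXException {
        stack.pop().endElement(uri, localName, qName);  // element closes: pop it off
    }

    static List<String> parse(String xml) {
        try {
            HandlerStackSketch dispatcher = new HandlerStackSketch();
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), dispatcher);
            return dispatcher.imagePaths;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(parse(
                "<doc><image>images/a.png</image><image>images/b.png</image></doc>"));
    }
}
```

    Only the handler for the element currently being read holds any state, so memory use depends on the largest single element rather than on the whole file.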

    Outputting XML isn't that hard to do with simple println calls.
  • 12. Re: Creating large XML files
    807589 Newbie
    Outputting XML isn't that hard to do with simple println calls.
    If you know the spec well, and know the exact rules for each piece of data in the document, then sure. If you don't, then it's really easy to generate XML that is not well-formed and therefore can't be parsed.
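
    One middle ground between a full DOM and hand-written println calls is the StAX XMLStreamWriter (javax.xml.stream, included since Java SE 6). It writes each piece straight to the output and takes care of escaping, so nothing is held in a tree. A minimal sketch follows; the element and attribute names and the class name are invented for illustration, and java.util.Base64 assumes Java 8 or later.

```java
import java.io.StringWriter;
import java.io.Writer;
import java.util.Base64;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

// Hypothetical illustration class.
class StreamingXmlWriterSketch {

    /** Writes one <image> element per entry directly to out; no document tree is built. */
    static void writeImages(Writer out, String[] names, byte[][] data) {
        try {
            XMLStreamWriter xml = XMLOutputFactory.newInstance().createXMLStreamWriter(out);
            xml.writeStartDocument();
            xml.writeStartElement("images");            // invented element name
            for (int i = 0; i < names.length; i++) {
                xml.writeStartElement("image");         // invented element name
                xml.writeAttribute("name", names[i]);   // special characters get escaped
                xml.writeCharacters(Base64.getEncoder().encodeToString(data[i]));
                xml.writeEndElement();
            }
            xml.writeEndElement();
            xml.writeEndDocument();
            xml.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Small in-memory demo; a real run would pass a buffered file writer instead. */
    static String demo() {
        StringWriter out = new StringWriter();
        writeImages(out, new String[] {"a&b.png"}, new byte[][] {{1, 2, 3}});
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

    For a file of the size discussed here, the images would be read and encoded one at a time inside the loop, so only a single image needs to be in memory at any point.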
  • 13. Re: Creating large XML files
    807589 Newbie
    It sounds to me like you're trying to use XML as some kind of database for storing images. Where does the requirement come from that such a big chunk all be stored in one XML file?
    I would, as others have suggested, rather refer to file paths or an actual database resource instead of getting myself into the trouble you're in. If it's a demand from someone in your company, just tell them that you looked for help on this forum, and that everyone here thought it was a stupid idea ;)