
    Error in SmartHeap, infinite loop

    bp9109

      We have Documaker/Docupresentment set up on our server and talk to it through EWPS and MQ.  Users edit forms in the WIPEdit plugin, and can also attach external PDFs with ADDMULTIPAGEBITMAP calls.

       

      When a user attaches too large a PDF, the IDS instance they hit uses massive amounts of memory to process it.  It looks like it breaks the PDF down into JPG pages fine, but once it pulls those pages in to add to the WIP record, the memory starts climbing.  One large attachment can use 1 GB of memory and max out the CPU for however long it takes to process it.
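      For what it's worth, that 1 GB figure lines up with straight rasterization math.  Here's a rough sketch of the arithmetic (the 300 DPI and 24-bit color settings are pure assumptions on my part, since I don't actually know how IDS renders the pages):

           public class PageMemoryEstimate {
               public static void main(String[] args) {
                   // Assumed rasterization settings; I don't know what IDS actually uses.
                   int dpi = 300;
                   double widthIn = 8.5, heightIn = 11.0;
                   int bytesPerPixel = 3;   // 24-bit color
                   int pages = 40;          // roughly what it takes to hit 1 GB

                   long perPage = Math.round(widthIn * dpi) * Math.round(heightIn * dpi) * bytesPerPixel;
                   System.out.printf("Uncompressed bitmap per page: ~%d MB%n", perPage / (1024 * 1024));
                   System.out.printf("%d pages held at once: ~%d MB%n", pages, pages * perPage / (1024 * 1024));
               }
           }

      At that rate, a few dozen pages held in memory at once gets you to a gigabyte no matter how small the source PDF is.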

       

      If the only problem were that it took a while to process and caused performance problems, it'd be manageable.  The real problem is that occasionally it reaches some breaking point, and the error below starts getting written to watchdog-stdout.txt dozens of times per second in an infinite loop.

       

           Error in SmartHeap
           Code 2
           File Unknown

       

      The log file gets larger and larger until it's gigabytes in size and the server starts to become unresponsive to all other users.  The fix is for us to log into the server and restart the docupresentment service, which clears out the IDS instances and also deletes the log file.

       

      I understand that there are practical limits on attachments, and that we could address this by limiting the size of what users can add.  But I don't want to jump to that without understanding the problem first.  Our company deals with legal documents and medical records, and some of them are very lengthy, so I want to make sure we try every other possible fix first; the inability to add those attachments might be a dealbreaker for the business users.

       

      I've tried a lot of things but can't quite pin it down: switching to UseImageExportOnly=YES, bumping the -Xmx values on the IDS instances and the Tomcat server, upgrading Java, and adding memory to the server, plus lots of other things I don't really remember at this point.

       

      It seems to be a memory issue at the heart of it, but it doesn't seem to be a limit on the IDS instance itself.  I've seen one instance climb to 1.2 GB and hold, eventually completing successfully, and I've seen one start to error at 750 MB.  When the server is freshly rebooted it's harder to recreate the error, meaning it takes a larger file / higher page count to trigger it in one transaction.  As time goes on it becomes easier to recreate with a (relatively) smaller file.  So maybe it's the aggregate of all the IDS instances' and/or the Tomcat server's memory usage; I haven't been able to pin that down.
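      One thing I may try in order to pin that down is logging memory snapshots on a schedule and correlating them with when the error starts.  Something small like this would show whether the failures track the machine's free RAM or an individual process (just a sketch; the cast only works on HotSpot-style JVMs, which is an assumption about what we're running):

           import java.lang.management.ManagementFactory;
           import com.sun.management.OperatingSystemMXBean;

           public class MemSnapshot {
               public static void main(String[] args) {
                   Runtime rt = Runtime.getRuntime();
                   OperatingSystemMXBean os =
                           (OperatingSystemMXBean) ManagementFactory.getOperatingSystemMXBean();

                   long mb = 1024L * 1024;
                   System.out.println("JVM heap used (MB):     " + (rt.totalMemory() - rt.freeMemory()) / mb);
                   System.out.println("JVM heap max (MB):      " + rt.maxMemory() / mb);
                   System.out.println("Process committed (MB): " + os.getCommittedVirtualMemorySize() / mb);
                   System.out.println("Machine free RAM (MB):  " + os.getFreePhysicalMemorySize() / mb);
               }
           }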

       

      I'd much rather it give a single out of memory error and cancel that one transaction than take down the whole server.

       

      One other interesting fact: it seems to be the internal processing of the attached pages into Documaker or the WIP record that causes the trouble, not the file's size itself.  I can recreate it with a 600 KB PDF that has 120 pages of text in it.  The entire PDF is 600 KB, but when it splits the pages apart into JPGs, each page is larger than the original document since it's now an image instead of compressed text.  So even limiting users to a maximum file size wouldn't really prevent it; we'd have to implement some third-party tool to get a page count to truly limit it.
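      If we do end up having to cap attachments, a page-count check is probably what I'd reach for rather than a raw file-size limit.  This is just a sketch using Apache PDFBox (one option among many; any PDF library that can report a page count would do, and the 50-page cutoff is a made-up number):

           import java.io.File;
           import java.io.IOException;
           import org.apache.pdfbox.pdmodel.PDDocument;

           public class AttachmentGuard {
               private static final int MAX_PAGES = 50;   // hypothetical cutoff

               // True if the PDF is small enough, by page count, to attach safely.
               public static boolean isAttachable(File pdf) throws IOException {
                   try (PDDocument doc = PDDocument.load(pdf)) {
                       return doc.getNumberOfPages() <= MAX_PAGES;
                   }
               }

               public static void main(String[] args) throws IOException {
                   System.out.println(isAttachable(new File(args[0])));
               }
           }

      The nice part of a page-count check is that it would catch the 600 KB / 120-page case that a pure size limit misses.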

       

      Does any of that make any sense?  Is there anything else we can try?

        • 1. Re: Error in SmartHeap, infinite loop
          bp9109

          The error seems to be resolved when I lower the max heap size on the IDS instances from 1024m to 512m (-Xms512m -Xmx512m) in docserv.xml.  I found this when I noticed that raising the heap memory to 1536m caused the error to happen much sooner, so I tried lowering it.  Does that make sense to anyone?  Is there a limitation in the SmartHeap component on how much heap memory it can handle at once?  Or is it conflicting with another config setting that I need to adjust to allow that much heap?
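
          The only way I can make the numbers fit myself, and this is purely a guess, is if the IDS instance is a 32-bit process where the Java heap and SmartHeap's native allocations have to share the same address space.  Rough arithmetic under that assumption (the 2 GB figure is also an assumption):

               public class AddressSpaceBudget {
                   public static void main(String[] args) {
                       // Guesswork: assumes a 32-bit IDS process with roughly 2 GB of user
                       // address space shared between the Java heap and native allocations.
                       long addressSpaceMb = 2048;
                       long[] xmxMb = {512, 1024, 1536};
                       for (long heap : xmxMb) {
                           System.out.printf("-Xmx%dm leaves ~%d MB for everything native%n",
                                   heap, addressSpaceMb - heap);
                       }
                   }
               }

          That would at least line up with 1536m failing faster and 512m behaving, but I'd appreciate confirmation from anyone who knows how SmartHeap actually allocates.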