7 Replies Latest reply: Feb 18, 2013 9:06 AM by mycoffee

    How to save memory when processing large result set

    mycoffee
      I need to dump multiple millions of rows of data into Excel files.

      I query the tables and open Excel files to write into.

      The problem is that even though I chopped the result into hundreds of files, closing each Excel file completely after 65536 rows, the memory usage keeps going up as I loop through the result set, and at one point it hits the heap size.

      How can I release the memory used by the rows that have already been processed?

      Thank you
        • 1. Re: How to save memory when processing large result set
          aksarben
          My first suspicion would be that you have a memory leak somewhere.
          • 2. Re: How to save memory when processing large result set
            mycoffee
            aksarben wrote:
            My first suspicion would be that you have a memory leak somewhere.
            Nope. I am sure there is no leak.
            I watched the memory going up and down;
            it dropped back to its initial value right after the process was done.

            The only thing I don't know is how to force the ResultSet to release the memory for the rows already processed.
            • 3. Re: How to save memory when processing large result set
              939520
              I think resultSet.close() will do what you want (you shouldn't have to set resultSet = null when you're done with it).

              You can't force the garbage collector to run and reclaim memory; it uses its own algorithm to decide when to do so.

              I question why your project is sending millions of records to Excel. Who is going to read a 10,000-page Excel document (or several of them)?
              Instead, I suggest you provide an (intelligent) filter mechanism that lets users send a subset of the data to an Excel document rather than all of it. For example: instead of sending the user the entire telephone book, have him search for results by lastName and/or firstName. That will cut down the number of records returned. Next, does the user really need all the columns of each record? That will cut it down further.

              You can search Google for 'java heap size' to learn how to increase the memory available to your program. However, your 65536 limit is probably due to Excel's row limit, not your Java program.
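              For what it's worth, a minimal sketch of the close() advice, using try-with-resources so the statement and result set are released even if an exception is thrown (the query string and the connection variable con are placeholders, not from the original post):

                  String sql = "SELECT last_name, first_name FROM phone_book"; // hypothetical query
                  try (PreparedStatement ps = con.prepareStatement(sql);
                       ResultSet rs = ps.executeQuery()) {
                      while (rs.next()) {
                          // process one row at a time; don't accumulate rows in a collection
                          String name = rs.getString("last_name");
                      }
                  } // rs and ps are closed here automatically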
              • 4. Re: How to save memory when processing large result set
                mycoffee
                936517 wrote: [full quote of reply #3 snipped]
                Sorry, I could not explain the need well.
                No, that is not the issue here. I already use the maximum heap size I can.
                But I can handle it now: I open the files and write directly to them instead of holding the data and dumping it all at once. That saves all the overhead, and it works fine even though the result set still consumes almost all the memory.
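                (For illustration, a rough sketch of that write-as-you-go approach; the file name and the CSV formatting are my assumptions, not the original code:)

                    // Stream each row straight to disk instead of buffering everything in memory.
                    // Needs: java.io.BufferedWriter, java.io.FileWriter, java.sql.ResultSet
                    try (ResultSet rs = ps.executeQuery();
                         BufferedWriter out = new BufferedWriter(new FileWriter("dump_001.csv"))) {
                        int cols = rs.getMetaData().getColumnCount();
                        while (rs.next()) {
                            StringBuilder row = new StringBuilder();
                            for (int i = 1; i <= cols; i++) {
                                if (i > 1) row.append(',');
                                row.append(rs.getString(i)); // no quoting/escaping shown here
                            }
                            out.write(row.toString());
                            out.newLine();
                        }
                    }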
                • 5. Re: How to save memory when processing large result set
                  jtahlborn
                  mycoffee wrote: [full quote of reply #4 snipped]
                  Is it possible you are using MySQL? The MySQL JDBC driver has a terrible default setup in that it keeps all the rows of a result set in memory! I think some of the latest drivers finally allow you to stream results sensibly, but you have to use the correct options.
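                  (If it is MySQL, the streaming setup Connector/J expects looks like this; a sketch, assuming con is an open connection to a MySQL database:)

                      // Connector/J only streams row-by-row when the statement is
                      // forward-only and read-only AND the fetch size is Integer.MIN_VALUE;
                      // with the defaults it buffers the entire result set in memory.
                      Statement stmt = con.createStatement(ResultSet.TYPE_FORWARD_ONLY,
                                                           ResultSet.CONCUR_READ_ONLY);
                      stmt.setFetchSize(Integer.MIN_VALUE);
                      ResultSet rs = stmt.executeQuery(sql); // rows now arrive as you iterate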
                  • 6. Re: How to save memory when processing large result set
                    761757
                    Just a thought:

                    Wouldn't it be a better approach in your case to start a new process, using ProcessBuilder, to call a .bat or .sh file that runs the query from sqlplus and spools the data to a CSV or XLS file (XLSX supports just over a million rows), and then use that file once it's done? This would not impact your heap, and in my experience running sqlplus from the command line has been pretty quick.

                    I'm not sure whether this approach works for you or not.
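                    (A rough sketch of the ProcessBuilder idea; the script name dump_query.sh, which would run sqlplus and spool to a CSV, is hypothetical:)

                        // Spool the data in a separate OS process so the Java heap is untouched.
                        // start() throws IOException; waitFor() throws InterruptedException.
                        ProcessBuilder pb = new ProcessBuilder("/bin/sh", "dump_query.sh");
                        pb.redirectErrorStream(true);      // merge stderr into stdout
                        Process proc = pb.start();
                        int exitCode = proc.waitFor();     // block until sqlplus finishes spooling
                        if (exitCode != 0) {
                            throw new IOException("spool script failed, exit code " + exitCode);
                        }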
                    • 7. Re: How to save memory when processing large result set
                      mycoffee
                      Solution:

                      // A forward-only, read-only statement whose cursors close at
                      // commit lets the driver discard rows that have already been
                      // read instead of caching the entire result set:
                      ps = con.prepareStatement(sql,
                                                ResultSet.TYPE_FORWARD_ONLY,
                                                ResultSet.CONCUR_READ_ONLY,
                                                ResultSet.CLOSE_CURSORS_AT_COMMIT);
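                      (Putting it together, usage might look like this; the fetch size, query, and output file are assumptions layered on top of the prepareStatement call above:)

                          try (PreparedStatement ps = con.prepareStatement(sql,
                                                           ResultSet.TYPE_FORWARD_ONLY,
                                                           ResultSet.CONCUR_READ_ONLY,
                                                           ResultSet.CLOSE_CURSORS_AT_COMMIT)) {
                              ps.setFetchSize(500);        // hint: fetch rows in small batches
                              try (ResultSet rs = ps.executeQuery();
                                   BufferedWriter out = new BufferedWriter(new FileWriter("dump.csv"))) {
                                  while (rs.next()) {
                                      out.write(rs.getString(1)); // write each row as it is read
                                      out.newLine();
                                  }
                              }
                          }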