18 Replies; latest reply May 24, 2011 4:29 PM by jschellSomeoneStoleMyAlias

OutOfMemoryError while iterating through a ResultSet

860579 Newbie
NOTE
I "think" I might have the answer. There must be metadata stored with each fetch. I am going to do a run using clearWarnings().
This test will take about 5 minutes....

I am:
(1) reading in one column per row from a MySQL table.
(2) the column is a UTF-8 encoded String.
(3) the String can be up to 256 characters long.
(4) I have about 80,000 rows to process.
If the integer value of a character falls within a range, I store it in an output table.

My program seems to work fine.
After 5 minutes of running, it inserts 3500 characters into the output table.
3500 is a reasonable result.
However, the program does eventually crash with the following stack trace:

Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
at java.util.Arrays.copyOfRange(Arrays.java:3209)
at java.lang.String.<init>(String.java:216)
at java.lang.StringBuffer.toString(StringBuffer.java:585)
at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:2928)
at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:1571)
at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:1666)
at com.mysql.jdbc.Connection.execSQL(Connection.java:2988)
at com.mysql.jdbc.Connection.execSQL(Connection.java:2917)
at com.mysql.jdbc.Statement.execute(Statement.java:529)
at alpha.DataUtil.filterCharacters(DataUtil.java:26) // <-- line #26
at alpha.Main.main(Main.java:11)
Java Result: 1


static void filterCharacters() throws Exception {
  Connection connQuery  = DriverManager.getConnection("jdbc:mysql://localhost/alpha", "root", "xxx");
  Connection connInsert = DriverManager.getConnection("jdbc:mysql://localhost/beta", "root", "xxx");

  Statement stmt = connQuery.createStatement();
  ResultSet rslt = stmt.executeQuery("select longstring from documents;");
  rslt.next();

  while(!rslt.isLast()) {
    String longString = rslt.getString(1);
    for(int i = 0; i < longString.length(); i++) {
      char c = longString.charAt(i);
      if(Util.isCharacterOfInterest(c)) {
        Statement st = connInsert.createStatement();
        try {
          st.execute("INSERT INTO interestingCharacters(character) VALUES('" + c + "');"); // <-- line #26
        } catch(SQLException e) { continue; } // catch/ignore primary key collision
      }
    }
    rslt.next();
  }
}
I don't see where a build-up of unreleased objects might be.
And I thought a ResultSet was designed to handle huge data fetches.

Edited by: azjp on May 19, 2011 1:07 PM
  • 1. Re: OutOfMemoryError while iterating through a ResultSet
    796440 Guru
    May or may not be related to your problem, but this is always wrong:
              st.execute("INSERT INTO interestingCharacters(character) VALUES('" + c + "');"); // <-- line #26
    The correct way is:
PreparedStatement ps = con.prepareStatement("INSERT INTO interestingCharacters(character) VALUES(?)"); // no trailing semicolon needed in the SQL
ps.setString(1, String.valueOf(c)); // setString() takes a String, so convert the char
ps.executeUpdate();
    A more likely cause of your problem is that you're missing a ps.close() in a finally block, so you're probably accruing an unbounded amount of resources (memory and others) in the Connection.
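    A minimal sketch of that pattern, assuming the thread's example table and column names (interestingCharacters, character) and a plain JDBC connection:

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

// Sketch of the advice above: bind the value with a parameter and close the
// statement in a finally block so it is released even when the insert throws.
// The table/column names are the thread's example schema, assumed here.
class SafeInsert {
    static final String SQL =
            "INSERT INTO interestingCharacters(character) VALUES(?)";

    static void insert(Connection conn, char c) throws SQLException {
        PreparedStatement ps = conn.prepareStatement(SQL);
        try {
            ps.setString(1, String.valueOf(c)); // setString() needs a String
            ps.executeUpdate();
        } finally {
            ps.close(); // runs even if executeUpdate() fails
        }
    }
}
```

    The finally block is the key point: without it, an exception (such as a key collision) skips the close() and the statement's resources stay attached to the Connection.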
  • 2. Re: OutOfMemoryError while iterating through a ResultSet
    796440 Guru
    This is also a bad idea:
            } catch(SQLException e) { continue; } // catch/ignore primary key collision
    It assumes the PK collision is the only error that can occur, and provides no details about what went wrong. At the very least, you should be calling e.printStackTrace().
  • 3. Re: OutOfMemoryError while iterating through a ResultSet
    860579 Newbie
    jverd wrote:
    May or may not be related to your problem, but this is always wrong:
    st.execute("INSERT INTO interestingCharacters(character) VALUES('" + c + "');"); // <-- line #26
    The correct way is:
    PreparedStatement ps = con.prepareStatement("INSERT INTO interestingCharacters(character) VALUES(?)"); // no trailing semicolon needed in the SQL
    ps.setString(1, String.valueOf(c)); // setString() takes a String, so convert the char
    ps.executeUpdate();
    A more likely cause of your problem is that you're missing a ps.close() in a finally block, so you're probably accruing an unbounded amount of resources (memory and others) in the Connection.
    Executing connection.clearWarnings() helped it process many more rows, but it still crashed, this time with different output:

    Exception in thread "main" Java Result: 1
    BUILD SUCCESSFUL (total time: 9 minutes 31 seconds)

    I will incorporate your suggestions, and then do some further testing.

    thanks.
  • 4. Re: OutOfMemoryError while iterating through a ResultSet
    DrClap Expert
    You probably have a scrollable ResultSet. Actually since the isLast() method works, you must have one. And since you gave yourself the ability to scroll through the ResultSet, it appears the JDBC driver is caching all of the rows, just in case you do that.

    So try using a forward-only ResultSet. Then the driver won't need to cache all of the rows, since you can't go back and access them after you've seen them. You'll have to change your code to use the standard way of going through it:
     while (rs.next()) {
      // handle the current row
    }
    But it's really better to use the standard idiom rather than something you made up yourself, even if what you made up yourself works perfectly well.
  • 5. Re: OutOfMemoryError while iterating through a ResultSet
    796440 Guru
    DrClap wrote:
    You probably have a scrollable ResultSet. Actually since the isLast() method works, you must have one.
    Good eye!
  • 6. Re: OutOfMemoryError while iterating through a ResultSet
    Tolls Journeyer
    jverd wrote:
    May or may not be related to your problem, but this is always wrong:
    st.execute("INSERT INTO interestingCharacters(character) VALUES('" + c + "');"); // <-- line #26
    The correct way is:
    PreparedStatement ps = con.prepareStatement("INSERT INTO interestingCharacters(character) VALUES(?)"); // no trailing semicolon needed in the SQL
    ps.setString(1, String.valueOf(c)); // setString() takes a String, so convert the char
    ps.executeUpdate();
    A more likely cause of your problem is that you're missing a ps.close() in a finally block, so you're probably accruing an unbounded amount of resources (memory and others) in the Connection.
    In addition to this, I would shift the declaration of the PreparedStatement out of the while loop since you're using the same one anyway. No point recreating it each time, since you're merely changing the setXXX() value.

    Not that that appears to be your problem...just thought I'd mention it.
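    The hoisting described here might look like this sketch: prepare once before the loop, rebind inside it, close once after. The schema names are the thread's example, assumed here, and chars stands in for the characters found while scanning.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

// Sketch of preparing the statement once and reusing it: only the bound
// value changes per iteration, so there is no per-character statement
// creation to leak.
class ReusedInsert {
    static void insertAll(Connection conn, char[] chars) throws SQLException {
        PreparedStatement ps = conn.prepareStatement(
                "INSERT INTO interestingCharacters(character) VALUES(?)");
        try {
            for (char c : chars) {
                ps.setString(1, String.valueOf(c)); // rebind, don't re-prepare
                ps.executeUpdate();
            }
        } finally {
            ps.close(); // a single close for the whole run
        }
    }
}
```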
  • 7. Re: OutOfMemoryError while iterating through a ResultSet
    796440 Guru
    Good point. I should have made that clear. Thanks.
  • 8. Re: OutOfMemoryError while iterating through a ResultSet
    802889 Explorer
    MySQL by default fetches the full resultset[1], so most likely you are already almost at the end of available memory, and the insert simply is the final drop. You could try to set the fetchsize to Integer.MIN_VALUE (using setFetchSize() on the statement) to disable fetching the entire resultset.

    1: http://dev.mysql.com/doc/refman/5.0/en/connector-j-reference-implementation-notes.html (under ResultSet)
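    Put together, the streaming setup described in the linked implementation notes might look like this sketch. The query and table are the thread's examples; the Integer.MIN_VALUE behavior is specific to MySQL Connector/J.

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

// Sketch of a streaming read with MySQL Connector/J: a forward-only,
// read-only statement with fetchSize == Integer.MIN_VALUE tells the driver
// to stream rows instead of buffering the whole result set in memory.
class StreamingScan {
    static void scan(Connection conn) throws SQLException {
        Statement stmt = conn.createStatement(
                ResultSet.TYPE_FORWARD_ONLY,
                ResultSet.CONCUR_READ_ONLY);
        stmt.setFetchSize(Integer.MIN_VALUE); // Connector/J streaming hint
        ResultSet rs = stmt.executeQuery("SELECT longstring FROM documents");
        try {
            while (rs.next()) { // standard forward-only iteration
                String row = rs.getString(1);
                // process row here...
            }
        } finally {
            rs.close();
            stmt.close();
        }
    }
}
```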
  • 9. Re: OutOfMemoryError while iterating through a ResultSet
    860579 Newbie
    TheAvalanche wrote:
    MySQL by default fetches the full resultset[1], so most likely you are already almost at the end of available memory, and the insert simply is the final drop. You could try to set the fetchsize to Integer.MIN_VALUE (using setFetchSize() on the statement) to disable fetching the entire resultset.

    1: http://dev.mysql.com/doc/refman/5.0/en/connector-j-reference-implementation-notes.html (under ResultSet)
    While each recommendation helped, this invocation is what let my program run until every last row was analyzed:

    Statement stmt = conn.createStatement(ResultSet.TYPE_FORWARD_ONLY, ResultSet.CONCUR_READ_ONLY);
    stmt.setFetchSize(Integer.MIN_VALUE); // <-- problem solved

    I found a few more characters of interest, and this will save me a lot of headache later on.

    thanks for all the help.
  • 10. Re: OutOfMemoryError while iterating through a ResultSet
    jschellSomeoneStoleMyAlias Expert
    azjp wrote:
    NOTE
    I "think" I might have the answer. There must be metadata stored with each fetch. I am going to do a run using clearWarnings().
    This test will take about 5-minutes....

    I am:
    (1) reading in one column per row from a mysql table.
    (2) the column is a utf-8 encoded String.
    (3) the String can be up to 256 characters long.
    (4) I have about 80,000 rows to process.
    If the integer value of a character falls within a range, I store it in an output table;
    String (256 chars) x 80,000 = (512+8) bytes * 80,000 = 41,600,000 bytes (about 40 MB).

    Only way you can run out of memory with that is if you are in fact doing something else or you are running on java 1.5 or even 1.4 and have not increased the max memory.

    And you are reading strings, process them, and then store the result in another database table?
    Why do you not just use a stored proc?
    Presumably there is a reason it is two databases, but even so a proc could return only the results (presuming MySQL doesn't support cross-database access).
  • 11. Re: OutOfMemoryError while iterating through a ResultSet
    860579 Newbie
    jschell wrote:
    azjp wrote:
    NOTE
    I "think" I might have the answer. There must be metadata stored with each fetch. I am going to do a run using clearWarnings().
    This test will take about 5-minutes....

    I am:
    (1) reading in one column per row from a mysql table.
    (2) the column is a utf-8 encoded String.
    (3) the String can be up to 256 characters long.
    (4) I have about 80,000 rows to process.
    If the integer value of a character falls within a range, I store it in an output table;
    String (256 chars) x 80,000 = (512+8) bytes * 80,000 = 41,600,000 bytes (about 40 MB).

    Only way you can run out of memory with that is if you are in fact doing something else or you are running on java 1.5 or even 1.4 and have not increased the max memory.
    I am using the client VM 1.6.0.
    The only thing I do is from the main() method invoke one static method.

    My objective is just to bring order to a massive amount of text.
    I normally use info in one table to seed a second table with keys.
    Then, I fill-in the remaining columns in the seeded table with info read from text files and/or data in other tables.
    And you are reading strings, process them, and then store the result in another database table?
    Why do you not just use a stored proc?
    Presumably there is a reason it is two databases, but even so a proc could return only the results (presuming MySQL doesn't support cross database support.)
    I am even less familiar with stored procedures than I am with Java.
    I have one basic method.
    It runs once, grabs data, filters data, populates part of a table;
    Alter the method, run it once, grab data, filter data, populate part of a table.
    etc.

    Once my data is finally organized, I can try OOP.
  • 12. Re: OutOfMemoryError while iterating through a ResultSet
    802889 Explorer
    azjp wrote:
    TheAvalanche wrote:
    MySQL by default fetches the full resultset[1], so most likely you are already almost at the end of available memory, and the insert simply is the final drop. You could try to set the fetchsize to Integer.MIN_VALUE (using setFetchSize() on the statement) to disable fetching the entire resultset.

    1: http://dev.mysql.com/doc/refman/5.0/en/connector-j-reference-implementation-notes.html (under ResultSet)
    While each recommendation helped, this invocation is what let my program run until every last row was analyzed:

    Statement stmt = conn.createStatement(ResultSet.TYPE_FORWARD_ONLY, ResultSet.CONCUR_READ_ONLY);
    stmt.setFetchSize(Integer.MIN_VALUE); // <-- problem solved
    Ah, I actually thought that was the default resultset type, so I didn't think it was necessary to include it in my reply. Sorry :)
  • 13. Re: OutOfMemoryError while iterating through a ResultSet
    jschellSomeoneStoleMyAlias Expert
    azjp wrote:
    String (256 chars) x 80,000 = (512+8) bytes * 80,000 = 41,600,000 bytes (about 40 MB).

    Only way you can run out of memory with that is if you are in fact doing something else or you are running on java 1.5 or even 1.4 and have not increased the max memory.
    I am using the client VM 1.6.0.
    The only thing I do is from the main() method invoke one static method.

    My objective is just to bring order to a massive amount of text.
    That isn't "massive".

    It is very little memory on a modern machine.
    From what you have stated here it is NOT possible for you to run out of memory.

    So either something you stated here is incorrect or you are doing something else which is significantly impacting your memory.
  • 14. Re: OutOfMemoryError while iterating through a ResultSet
    860579 Newbie
    jschell wrote:
    azjp wrote:
    String (256 chars) x 80,000 = (512+8) bytes * 80,000 = 41,600,000 bytes (about 40 MB).

    Only way you can run out of memory with that is if you are in fact doing something else or you are running on java 1.5 or even 1.4 and have not increased the max memory.
    I am using the client VM 1.6.0.
    The only thing I do is from the main() method invoke one static method.

    My objective is just to bring order to a massive amount of text.
    That isn't "massive".

    It is very little memory on a modern machine.
    From what you have stated here it is NOT possible for you to run out of memory.

    So either something you stated here is incorrect or you are doing something else which is significantly impacting your memory.
    When I added even more rows (200k) to analyze, the program crashed after 280,000 key collisions with the same error:
    Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
    at alpha.Main.filterCharacters(Main.java:42)
    at alpha.Main.main(Main.java:10)

    public static void main(String[] args) {
            try {
                Class.forName("com.mysql.jdbc.Driver");
                filterCharacters();
            } catch (Exception e) {
                e.printStackTrace();
            }
        }
    
    
        static void filterCharacters() throws Exception {
            Set<Character> db = new HashSet<Character>();
            Connection connQuery  = DriverManager.getConnection("jdbc:mysql://localhost/alpha", "root", "xxx");
            Connection connInsert = DriverManager.getConnection("jdbc:mysql://localhost/beta", "root", "xxx");
    
            // Statement stmt = connQuery.createStatement(ResultSet.TYPE_FORWARD_ONLY, ResultSet.CONCUR_READ_ONLY);
            // stmt.setFetchSize(Integer.MIN_VALUE);
            Statement stmt = connQuery.createStatement();
    
            ResultSet rslt = stmt.executeQuery("SELECT longstring FROM document");
    
            PreparedStatement ps   = connInsert.prepareStatement("INSERT INTO results(character) VALUES(?)");
            while(rslt.next()) {
                String def = rslt.getString(1);
                for(int i = 0; i < def.length(); i++) {
                    char c = def.charAt(i);
                    if(db.contains(Character.valueOf(c))) continue; // <--- prevent the primary key collisions
                    if(Util.isInteresting(c)) {
                        db.add(Character.valueOf(c));
                        ps.setString(1, String.valueOf(c));
                        try {
                            ps.executeUpdate();  // <--- line #42
                        } catch(SQLException e) {
                            System.out.println("SQLException: " + e.getMessage());
                        }
                    }
                    ps.clearWarnings();
                    connInsert.clearWarnings();
                }
            }
        }   
    }
    This is what I get in the catch block when an insert hits a key collision:
    .....
    SQLException: Duplicate entry '現' for key 'PRIMARY'
    SQLException: Duplicate entry '象' for key 'PRIMARY'
    SQLException: Duplicate entry '出' for key 'PRIMARY'
    .....

    This has Asian characters, so maybe they will not display between the single quotes.

    When I use the Set<Character> collection to stop key collisions, the program finishes no matter how many rows I add (tested out to 200k rows). So the weird thing to me is that
    "ps.clearWarnings()"
    "connInsert.clearWarnings()"
    help, but do not prevent the eventual "out of memory error".
    I see that "SQLException" is not a "warning". But when an exception is thrown, does it not expire once the catch block is exited? Key collisions are doing "something" to consume memory, but I think it is not warnings metadata or exception objects. I am sure the NetBeans profiler can tell me this; I have not learned that tool yet, but I am studying.

    Also, to me, allowing the database to prevent duplicate key entries makes perfect sense. Before inserting into a table, I can't imagine I should query the table to see if the key already exists; must each insertion be preceded by a query? In future document filtering, almost all the characters (the most frequently used ones) will already be in the "results" table. In a 100k-line document, I might not get more than 10 new characters, but I will easily get hundreds of thousands of collisions (unless I query first and skip the insert if the key already exists). I can't think of a good way to avoid key collisions, so I can allow them; I just don't want to risk running out of memory.

    OK.
    After a little sleep, I realized I can just suppress MySQL telling me about the key collisions with:
    "INSERT IGNORE INTO...."

    Still, I don't see why memory is consumed because of the key collisions.
    But I am working on understanding the profiler in NetBeans today, so (I think) I will see the answer clearly.
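    The Set-based de-duplication in the code above can also be pulled out of the JDBC loop and tested on its own. This sketch assumes a CJK-ideograph range as a stand-in for Util.isInteresting(), which the thread never shows; only the characters it returns would need an INSERT (or one can rely on INSERT IGNORE server-side), so the database never reports a duplicate key.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Client-side duplicate filtering, separated from the JDBC code so it can be
// tested without a database. isInteresting() is an assumed stand-in for the
// thread's Util.isInteresting(); here it accepts CJK unified ideographs.
class CharFilter {
    static boolean isInteresting(char c) {
        return c >= 0x4E00 && c <= 0x9FFF; // assumed range of interest
    }

    // Returns the interesting characters in text not already in seen, adding
    // them to seen. Only these need inserting, so no PK collisions occur.
    static Set<Character> newCharacters(String text, Set<Character> seen) {
        Set<Character> fresh = new LinkedHashSet<Character>();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (isInteresting(c) && seen.add(c)) {
                fresh.add(c);
            }
        }
        return fresh;
    }
}
```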
    Edited by: azjp on May 23, 2011 10:27 PM

    Edited by: azjp on May 24, 2011 3:57 AM
