
OutOfMemoryException when i do data insertion continuously

531207 Member Posts: 11
edited Sep 7, 2006 10:20PM in Berkeley DB Java Edition
Hi,
I'm using Berkeley DB to store a huge amount of data. Each piece of source data is stored as five entities (as defined with the DPL). Four of them hold only a small amount of data and the remaining one is a little bigger than the others. The total size of one piece of source data does not reach 2 KB.

I wrote a server program to handle the data storage. The server continually converts the source data into entity objects and stores them in BDB. It takes about 3-5 seconds on average to store 1000 pieces of source data.

The problem is: when the stored data reached about 76,000 pieces, I got an OutOfMemoryException (the max heap size is 1 GB); the total size of the .jdb files was about 400 MB at that point.
I need to run the server 24/7, and the amount of source data could reach 10,000,000 pieces at most, so this problem bothers me a lot.

By the way, I use transactions in my program. The data must be able to be deleted or updated at runtime, and the query frequency is quite high.
I use only one Environment and one EntityStore instance in my server application, just like the example in %JEHOME%/examples/persist/gettingStarted.

Another thing: I tried decreasing the cache size from 60% to 40%. That seems to have some effect, but not much.

My question is: is this happening because I'm using Berkeley DB in the wrong way? If so, how can I solve this problem? Thank you very much if anyone can help.

Best
Richie

Comments

  • Richie,

    Is it possible that you are doing all your insertions within a single transaction?

    JE databases are implemented as btrees. As the database grows, not all nodes will fit into the cache, and JE evicts those portions to disk. When this happens, JE becomes slower but can continue, so a fixed-size cache can handle an arbitrarily large database.

    There are some internal objects that cannot be paged out in this manner. Transactions are one such item, because they maintain per-record locks. A transaction that has written many records will begin to consume memory that can only be released when the transaction ends. One workaround is to structure your application to avoid transactions that span many write operations (see the sketch at the end of this comment).

    Another factor that may apply is that Cursor or EntityCursor objects can prevent cache eviction. A cursor on a data record pins that record in the cache, and many open cursors can undercut JE's attempt to manage the cache.

    Probably the best step is to read the information on cache management here: http://www.sleepycat.com/jedocs/GettingStartedGuide/cachesize.html and to look at the environment statistics to determine what the cache is doing.

    Regards,

    Linda
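
    A minimal sketch of the short-transaction pattern described above, assuming a hypothetical SourceRecord entity, store name, and environment directory (none of these come from the thread); only the standard JE/DPL calls (beginTransaction, PrimaryIndex.put, Transaction.commit, Environment.getStats) are real API:

    import java.io.File;

    import com.sleepycat.je.DatabaseException;
    import com.sleepycat.je.Environment;
    import com.sleepycat.je.EnvironmentConfig;
    import com.sleepycat.je.StatsConfig;
    import com.sleepycat.je.Transaction;
    import com.sleepycat.persist.EntityStore;
    import com.sleepycat.persist.PrimaryIndex;
    import com.sleepycat.persist.StoreConfig;
    import com.sleepycat.persist.model.Entity;
    import com.sleepycat.persist.model.PrimaryKey;

    // Hypothetical entity standing in for one of the five DPL entities.
    @Entity
    class SourceRecord {
        @PrimaryKey
        long id;
        String payload;

        SourceRecord() {}

        SourceRecord(long id, String payload) {
            this.id = id;
            this.payload = payload;
        }
    }

    public class ShortTxnInsert {
        public static void main(String[] args) throws DatabaseException {
            EnvironmentConfig envConfig = new EnvironmentConfig();
            envConfig.setAllowCreate(true);
            envConfig.setTransactional(true);
            envConfig.setCachePercent(40); // the reduced cache size mentioned above

            Environment env = new Environment(new File("dbEnv"), envConfig);

            StoreConfig storeConfig = new StoreConfig();
            storeConfig.setAllowCreate(true);
            storeConfig.setTransactional(true);
            EntityStore store = new EntityStore(env, "SampleStore", storeConfig);
            PrimaryIndex<Long, SourceRecord> index =
                store.getPrimaryIndex(Long.class, SourceRecord.class);

            for (long i = 0; i < 1000; i++) {
                // One short transaction per piece of source data: its locks
                // and memory are released as soon as commit() returns.
                Transaction txn = env.beginTransaction(null, null);
                try {
                    index.put(txn, new SourceRecord(i, "record " + i));
                    txn.commit();
                } catch (DatabaseException e) {
                    txn.abort();
                    throw e;
                }
            }

            // Dump environment statistics to see what the cache is doing.
            System.out.println(env.getStats(new StatsConfig()));

            store.close();
            env.close();
        }
    }

    With one commit per piece of source data, the per-record locks are released promptly, and the printed EnvironmentStats include cache and eviction counters.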
  • 531207 Member Posts: 11
    edited Sep 6, 2006 1:39AM
    Hi, Linda
    Thanks for your reply. After doing some experiments and adjusting the cache size, the situation is much better now, though still not perfect (free memory still slowly decreases).

    I'm definitely not doing all my insertions within a single transaction, but I do insert each piece of source data in its own transaction, which actually contains about 5 to 20 entity write operations. I don't think that's the issue, and I have to do it this way for atomicity (otherwise a partial insert would cause a serious system failure; if there is some other way to keep the data consistent, please let me know). I'm sure I commit the transaction after every source data insertion.
    About cursors: I have not used those objects during the write process.

    Anyway, your suggestion is quite helpful. I'll check my application one more time. Thank you very much.

    Regards,
    Richie
  • Charles Lamb Member Posts: 836
    Hi Richie,

    Please check out this blog entry and let me know if it helps at all.

    Thanks.

    Charles Lamb
  • 531207 Member Posts: 11
    edited Sep 7, 2006 10:20PM
    Hi Charles,
    I run my application under Java JRE 1.5.0_06.
    My solution is that, since the memory can be recovered by closing the Environment, I force my server to re-open the DB environment after a certain number of update operations (see the sketch after this post). After resizing the cache my situation got much better, so this works fine for now.

    Besides, I used an ordinary PC for testing, so the hardware may be part of the performance problem. I think I should run further tests on a workstation, but I don't have one available these days. The performance of my application is still not satisfying and queries are quite slow, so I still have lots of work to do. Unfortunately I have some other work that needs to be done as soon as possible, so I have to put this aside for a while. Thank you all for helping me so much. I learned quite a lot from you and had lots of fun. I would like to study Berkeley DB more and try to solve my problem in a better way right after I finish the work at hand.

    Regards,
    Richie
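
    A rough sketch of the periodic re-open workaround described above, assuming a hypothetical update threshold and store name (neither comes from the thread); the JE/DPL classes and calls used (Environment, EntityStore, close) are standard API, everything else is made up for illustration:

    import java.io.File;

    import com.sleepycat.je.DatabaseException;
    import com.sleepycat.je.Environment;
    import com.sleepycat.je.EnvironmentConfig;
    import com.sleepycat.persist.EntityStore;
    import com.sleepycat.persist.StoreConfig;

    // Closes and re-opens the Environment/EntityStore pair after a fixed
    // number of updates, releasing the memory held by the old environment.
    public class EnvRecycler {

        private static final int UPDATES_PER_CYCLE = 50000; // hypothetical threshold

        private final File envHome;
        private final EnvironmentConfig envConfig;
        private final StoreConfig storeConfig;

        private Environment env;
        private EntityStore store;
        private int updatesSinceOpen;

        public EnvRecycler(File envHome, EnvironmentConfig envConfig,
                           StoreConfig storeConfig) throws DatabaseException {
            this.envHome = envHome;
            this.envConfig = envConfig;
            this.storeConfig = storeConfig;
            open();
        }

        private void open() throws DatabaseException {
            env = new Environment(envHome, envConfig);
            store = new EntityStore(env, "SampleStore", storeConfig);
            updatesSinceOpen = 0;
        }

        // Call after each committed update; once the threshold is reached,
        // the store and environment are closed and re-opened.
        public void noteUpdate() throws DatabaseException {
            if (++updatesSinceOpen >= UPDATES_PER_CYCLE) {
                store.close();
                env.close();
                open();
            }
        }

        public EntityStore getStore() {
            return store;
        }
    }

    Note that closing the environment invalidates any open cursors and transactions, so the re-open should only happen at a quiet point between updates.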
This discussion has been closed.