StoredMaps vs Cursors

kiwiclive Member Posts: 21
edited Jun 19, 2014 7:33AM in Berkeley DB Java Edition

Hi Guys,

I've been working with the CollectionsAPI which provides a very nice interface with Java. However, we are running into size issues.

Am I right in thinking that when you create a StoredMap for reading from a database, the *whole* database is read into the map, rather than being lazily loaded as needed?

If this is the case, the access time will be slower as the database grows. I have not been able to find a way to enable lazy-loading, or to be able to limit the amount of data loaded into the map at one time. Am I also right in thinking the iterator interface is for iterating through the map once it is loaded, not as a kind of cursor into the underlying DB to load up as many records as you need into the map (my original understanding)?

If the above is true, then it may be necessary to drop back to the base API and use cursors. So if I ask the cursor to fetch the first 100 records from a 10M-record database, is this 100 I/O operations as items get loaded into the cache or (as the keys are sorted) is there some way of doing this in a block?

Once again, apologies for the dumb questions but I'm a little unclear on these areas.

Thanks,
Clive

Answers

  • Greybird-Oracle
    Greybird-Oracle Member Posts: 2,690
    edited Jun 12, 2014 7:55PM

    Clive,


    A StoredMap does not store anything in memory.  Each operation retrieves the record from the underlying Database.


    If you get an Iterator from the map, the iterator object does cache a number of records.  See StoredCollection.iterator and storedIterator for more information.
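    Since a StoredMap is a java.util.Map, the "first 100 records" case from the original question can be sketched with a plain iterator that stops early. This is only an illustration: a TreeMap stands in for the StoredMap so the example runs without JE on the classpath, and FirstNDemo and firstN are made-up names. With a real StoredMap, check the StoredCollection.iterator and storedIterator docs mentioned above for whether the iterator has to be closed when you stop early.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class FirstNDemo {

    /**
     * Copies at most n entries, in the map's iteration order, then stops.
     * Hypothetical helper: it works on any java.util.Map and never asks
     * the map for its size.
     */
    public static <K, V> List<Map.Entry<K, V>> firstN(Map<K, V> map, int n) {
        List<Map.Entry<K, V>> out = new ArrayList<Map.Entry<K, V>>();
        Iterator<Map.Entry<K, V>> it = map.entrySet().iterator();
        while (out.size() < n && it.hasNext()) {
            Map.Entry<K, V> e = it.next();
            // Copy the pair so the result does not depend on the source map.
            out.add(new AbstractMap.SimpleImmutableEntry<K, V>(e.getKey(), e.getValue()));
        }
        return out;
    }

    public static void main(String[] args) {
        // Stand-in for a StoredMap (assumption: no JE available here).
        Map<Integer, String> records = new TreeMap<Integer, String>();
        for (int i = 1; i <= 1000; i++) {
            records.put(i, "record-" + i);
        }
        List<Map.Entry<Integer, String>> page = firstN(records, 100);
        System.out.println(page.size() + " entries, last key " + page.get(page.size() - 1).getKey());
    }
}
```

    The point of the design is that only the iterator is advanced; nothing calls size() or copies the whole map.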


    --mark

  • kiwiclive
    kiwiclive Member Posts: 21

    Hi Mark,

    Thanks for the response. What I was really aiming at was getting data from the DB into memory. So I presume this is the cache as opposed to the StoredMap objects themselves.

    My issue is that the StoredMap constructor seems to load the whole database (table) into memory and then I can use an iterator to step through the results. If I have 3 million records in a database, I don't necessarily want to load them all initially, as many are not relevant and for very large databases, the delay during the load is too long. Unfortunately, it does not seem possible to create an 'empty' StoredMap and then call an iterator to load a subset of the data into the cache.

    I've managed to achieve what I need using cursors, although this meant falling back to the base API, so all is good.

    I hope I'm making sense !

    Clive

  • Greybird-Oracle
    Greybird-Oracle Member Posts: 2,690
    Accepted Answer

    Clive,

    The StoredMap constructor absolutely does not read any records; it does not load data into memory.  There is no difference between using a StoredMap and the lower level Database API directly, in this respect.

    So there must be something else about your program that is different when you switched to using the Database API directly, or perhaps you're misinterpreting something you're seeing.

    --mark

  • kiwiclive
    kiwiclive Member Posts: 21

    OK, thanks Mark.

    Let me get a couple of examples together and see where I am going wrong.

    Clive

  • kiwiclive
    kiwiclive Member Posts: 21

    Hi Mark,

    Ok, I admit it, I'm an idiot.  I was being hit by two different issues here.

    (1) When in the IDE and single-stepping through code, after creating the StoredMap, the IDE must call an iterator to fetch all the records, thereby giving the impression that the constructor was pulling in all the data.

    (2) Then in the live code, I was doing this:

    StoredMap records = new StoredMap(db, keyBinding, valueBinding, false);

    log.debug(records.size());  // <-- again causing an iteration through the records!

    So thank you for clearing up my misconceptions. I'm happily back in the world of the Collections API and it's doing what I expect of it. I was just not using it properly.

    Please accept my heartfelt thanks :-)

    Clive

  • Greybird-Oracle
    Greybird-Oracle Member Posts: 2,690

    Clive,

    You're welcome, and I'm glad to hear you figured it out.

    Yeah, the performance of the size() method (Database.count()) can be unexpected, even though it's documented to be expensive.  You're not the first person to encounter that issue.

    Was the IDE calling toString() for the map, just as a result of stepping?  Which IDE?
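    To make the distinction concrete, here is a toy model of a map-like view over a store. It is not how JE implements StoredMap or Database.count(); FakeStore, StoreView and the read counter are all hypothetical, and the only assumption carried over from this thread is that constructing the view reads nothing while size() has to walk the data.

```java
import java.util.Map;
import java.util.TreeMap;

public class LazyViewDemo {

    /** Hypothetical stand-in for an on-disk database; counts every record it touches. */
    public static final class FakeStore {
        public final TreeMap<Integer, String> records = new TreeMap<Integer, String>();
        public long reads = 0;

        public String read(int key) {
            reads++;                       // one record touched
            return records.get(key);
        }

        public int countByScanning() {
            int n = 0;
            for (Map.Entry<Integer, String> ignored : records.entrySet()) {
                reads++;                   // every record touched once
                n++;
            }
            return n;
        }
    }

    /** Hypothetical view: keeps a reference to the store, never a copy of its data. */
    public static final class StoreView {
        private final FakeStore store;

        public StoreView(FakeStore store) { this.store = store; }     // reads nothing
        public String get(int key)        { return store.read(key); } // single lookup
        public int size()                 { return store.countByScanning(); } // full scan
    }

    public static FakeStore newStore(int n) {
        FakeStore store = new FakeStore();
        for (int i = 0; i < n; i++) {
            store.records.put(i, "v" + i); // loading test data is not counted as a read
        }
        return store;
    }

    public static void main(String[] args) {
        FakeStore store = newStore(100000);
        StoreView view = new StoreView(store);
        System.out.println("reads after constructor: " + store.reads); // 0
        view.get(42);
        System.out.println("reads after get():       " + store.reads); // 1
        view.size();
        System.out.println("reads after size():      " + store.reads); // 100001
    }
}
```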

    We probably should change our toString method to only show the first N records.

    --mark

  • kiwiclive
    kiwiclive Member Posts: 21

    Hi Mark,

    I'm using IntelliJ.

    The problem was that the application was logging at INFO level, so the debug line was never printed and I wasn't even looking there. However, the argument to log.debug() is still evaluated regardless, unless you wrap the log line in a log-level check first.
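    A minimal sketch of that pitfall, for anyone who lands here later. It uses java.util.logging rather than log4j so that it runs with no extra jars (that swap is an assumption of the sketch), and expensiveSize() is a hypothetical stand-in for records.size().

```java
import java.util.logging.Level;
import java.util.logging.Logger;

public class LogGuardDemo {

    // Held in a static field so the configured logger is not garbage collected.
    static final Logger LOG = Logger.getLogger("LogGuardDemo");
    static {
        LOG.setLevel(Level.INFO);          // FINE (debug-level) messages are discarded
    }

    public static int sizeCalls = 0;

    /** Hypothetical stand-in for an expensive call such as records.size(). */
    public static int expensiveSize() {
        sizeCalls++;
        return 3000000;
    }

    /** The argument is built before fine() can check the level, so the call still runs. */
    public static void unguarded() {
        LOG.fine("size=" + expensiveSize());
    }

    /** Explicit level check: the expensive call is skipped when FINE is disabled. */
    public static void guarded() {
        if (LOG.isLoggable(Level.FINE)) {
            LOG.fine("size=" + expensiveSize());
        }
    }

    /** Supplier overload: the lambda only runs if the message would be logged. */
    public static void lazy() {
        LOG.fine(() -> "size=" + expensiveSize());
    }

    public static void main(String[] args) {
        unguarded();
        System.out.println("after unguarded: " + sizeCalls); // 1
        guarded();
        System.out.println("after guarded:   " + sizeCalls); // still 1
        lazy();
        System.out.println("after lazy:      " + sizeCalls); // still 1
    }
}
```

    In log4j 1.x the equivalent guard is `if (log.isDebugEnabled()) { ... }` around the debug call.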

    Cheers,
    Clive

This discussion has been closed.