7 Replies Latest reply: Jun 19, 2014 6:33 AM by kiwiclive

    StoredMaps vs Cursors

    kiwiclive

      Hi Guys,

       

      I've been working with the Collections API, which provides a very nice interface with Java. However, we are running into size issues.

       

      Am I right in thinking that when you create a StoredMap for reading from a database, the *whole* database is read into the map, and it is not lazily loaded as needed?

      If this is the case, the access time will be slower as the database grows. I have not been able to find a way to enable lazy-loading, or to be able to limit the amount of data loaded into the map at one time. Am I also right in thinking the iterator interface is for iterating through the map once it is loaded, not as a kind of cursor into the underlying DB to load up as many records as you need into the map (my original understanding)?

       

      If the above is true, then it may be necessary to drop back to the base API and use cursors. So if I ask the cursor to fetch the first 100 records from a 10M record database, is this 100 I/O operations as items get loaded into the cache, or (as the keys are sorted) is there some way of doing this in a block?

       

      Once again, apologies for the dumb questions but I'm a little unclear on these areas.

       

      Thanks,
      Clive

        • 1. Re: StoredMaps vs Cursors
          Greybird-Oracle

          Clive,


          A StoredMap does not store anything in memory.  Each operation retrieves the record from the underlying Database.


          If you get an Iterator from the map, the iterator object does cache a number of records.  See StoredCollection.iterator and storedIterator for more information.


          --mark

          • 2. Re: StoredMaps vs Cursors
            kiwiclive

            Hi Mark,

             

            Thanks for the response. What I was really aiming at was getting data from the DB into memory. So I presume this is the cache as opposed to the StoredMap objects themselves.

             

            My issue is that the StoredMap constructor seems to load the whole database (table) into memory and then I can use an iterator to step through the results. If I have 3 million records in a database, I don't necessarily want to load them all initially, as many are not relevant and for very large databases, the delay during the load is too long. Unfortunately, it does not seem possible to create an 'empty' StoredMap and then call an iterator to load a subset of the data into the cache.

             

            I've managed to achieve what I need using cursors, although this meant falling back to the base API, so all is good.

             

            I hope I'm making sense !

             

            Clive

            • 3. Re: StoredMaps vs Cursors
              Greybird-Oracle

              Clive,

               

              The StoredMap constructor absolutely does not read any records, it does not load data into memory.  There is no difference between using a StoredMap and the lower level Database API directly, in this respect.

               

              So there must be something else about your program that is different when you switched to using the Database API directly, or perhaps you're misinterpreting something you're seeing.

               

              --mark

              • 4. Re: StoredMaps vs Cursors
                kiwiclive

                ok, thanks Mark.

                 

                Let me get a couple of examples together and see where I am going wrong.

                 

                Clive

                • 5. Re: StoredMaps vs Cursors
                  kiwiclive

                  Hi Mark,

                   

                  Ok, I admit it, I'm an idiot.  I was being hit by two different issues here.

                   

                  (1) When in the IDE and single-stepping through code, after creating the StoredMap, the IDE must call an iterator to fetch all the records, thereby giving the impression that the constructor was pulling in all the data.

                   

                  (2) Then in the live code, I was doing this:

                  StoredMap records = new StoredMap(db, keyBinding, valueBinding, false);

                  log.debug(records.size());  // <-- Again causing the iteration through the records !!

                   

                  So thank you for clearing up my misconceptions. I'm happily back in the world of the Collections API and it's doing what I expect of it. I was just not using it properly.

                   

                  Please accept my heartfelt thanks :-)

                  Clive

                  • 6. Re: StoredMaps vs Cursors
                    Greybird-Oracle

                    Clive,

                     

                    You're welcome, and I'm glad to hear you figured it out.

                     

                    Yeah, the performance of the size() method (Database.count()) can be unexpected, even though it's documented to be expensive.  You're not the first person to encounter that issue.

                     

                    Was the IDE calling toString() for the map, just a result of stepping?  Which IDE?

                     

                    We probably should change our toString method to only show the first N records.
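
To illustrate why size() and toString() can look like "the constructor loaded everything", here is a minimal, self-contained simulation. It does not use the JE API at all: LazyBackedMap, sampleStore and recordsRead are hypothetical names, and a TreeMap merely stands in for the on-disk database. The only point is that a map view with no stored count has to walk every record to answer size(), and that the inherited AbstractMap.toString() (which a debugger may call when displaying a variable) walks them all again.

```java
import java.util.AbstractMap;
import java.util.AbstractSet;
import java.util.Iterator;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical stand-in for a database-backed map view; NOT the JE StoredMap code.
class LazyBackedMap extends AbstractMap<Integer, String> {
    private final TreeMap<Integer, String> store; // plays the role of the database
    int recordsRead = 0;                           // counts simulated record fetches

    LazyBackedMap(TreeMap<Integer, String> store) {
        this.store = store; // construction only keeps a reference; nothing is read
    }

    @Override
    public String get(Object key) {
        recordsRead++; // a point lookup touches a single record
        return store.get(key);
    }

    @Override
    public Set<Map.Entry<Integer, String>> entrySet() {
        return new AbstractSet<Map.Entry<Integer, String>>() {
            @Override
            public Iterator<Map.Entry<Integer, String>> iterator() {
                final Iterator<Map.Entry<Integer, String>> it = store.entrySet().iterator();
                return new Iterator<Map.Entry<Integer, String>>() {
                    @Override public boolean hasNext() { return it.hasNext(); }
                    @Override public Map.Entry<Integer, String> next() {
                        recordsRead++; // each iteration step fetches one record
                        return it.next();
                    }
                };
            }

            @Override
            public int size() {
                // No stored record count in this simulation: walk every record.
                int n = 0;
                for (Iterator<Map.Entry<Integer, String>> i = iterator(); i.hasNext(); i.next()) {
                    n++;
                }
                return n;
            }
        };
    }

    // Builds a simulated database with n records.
    static TreeMap<Integer, String> sampleStore(int n) {
        TreeMap<Integer, String> t = new TreeMap<>();
        for (int i = 0; i < n; i++) {
            t.put(i, "value-" + i);
        }
        return t;
    }

    public static void main(String[] args) {
        LazyBackedMap map = new LazyBackedMap(sampleStore(1000));
        System.out.println("after constructor: " + map.recordsRead);
        map.get(42);
        System.out.println("after get():       " + map.recordsRead);
        map.size();
        System.out.println("after size():      " + map.recordsRead);
        map.toString();
        System.out.println("after toString():  " + map.recordsRead);
    }
}
```

In this sketch the counter stays at zero after construction, goes up by one for the lookup, and then jumps by the full record count for each of size() and toString().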

                     

                    --mark

                    • 7. Re: StoredMaps vs Cursors
                      kiwiclive

                      Hi Mark,

                       

                      I'm using IntelliJ.

                       

                      What caused the issue was that the application was logging at INFO level, so the debug line was not being printed and I was not even looking there. However, the argument passed to the log4j call is still evaluated regardless, unless you wrap the log line in a log-level check first.
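
A small sketch of that logging pitfall, for anyone who hits the same thing. To stay self-contained it uses java.util.logging rather than log4j (with log4j 1.x the equivalent guard is log.isDebugEnabled()), and expensiveSize() is a hypothetical stand-in for a costly call such as records.size(). Java evaluates method arguments before the call, so the logger never gets the chance to skip the work unless you guard the line or defer the message.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

class GuardedLogging {
    static int sizeCalls = 0;

    // Hypothetical stand-in for an expensive call such as records.size().
    static int expensiveSize() {
        sizeCalls++;
        return 3_000_000;
    }

    // Unguarded: the argument is built (and expensiveSize() runs) before
    // the logger checks whether FINE is enabled.
    static void unguarded(Logger log) {
        log.fine("records: " + expensiveSize());
    }

    // Guarded: the level check comes first, so the expensive call is skipped
    // whenever FINE is disabled.
    static void guarded(Logger log) {
        if (log.isLoggable(Level.FINE)) {
            log.fine("records: " + expensiveSize());
        }
    }

    // Deferred: the Supplier is only invoked if the message will be logged.
    static void deferred(Logger log) {
        log.fine(() -> "records: " + expensiveSize());
    }

    public static void main(String[] args) {
        Logger log = Logger.getLogger("demo");
        log.setLevel(Level.INFO); // FINE (debug-like) messages are discarded

        unguarded(log);
        System.out.println("after unguarded: " + sizeCalls);
        guarded(log);
        System.out.println("after guarded:   " + sizeCalls);
        deferred(log);
        System.out.println("after deferred:  " + sizeCalls);
    }
}
```

With the logger at INFO, only the unguarded call pays for expensiveSize(); the guarded and deferred variants leave the counter unchanged.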

                       

                      Cheers,
                      Clive