This discussion is archived
14 Replies Latest reply: Sep 27, 2013 2:21 PM by 501863e4-e8fa-4a1a-bbd6-0d2c730c3f67

Secondary Database

501863e4-e8fa-4a1a-bbd6-0d2c730c3f67 Newbie

Hi

 

I'm currently trying to understand SecondaryDatabase and SecondaryMultiKeyCreator and have a few questions.

 

1) Creating a relationship from a put operation is pretty straightforward. But what am I expected to do when SecondaryMultiKeyCreator is called during a delete operation? I was expecting that this invocation would NOT occur at all, i.e. deletes are handled automatically?

 

2) When a key/value is updated, secondary keys might need to be updated as well. Some might be added, some removed etc, but I can't find an obvious way to do this except using SecondaryDatabase directly/manually?

 

3) In general, I can't see how to distinguish different types of operations (create/delete/update) from within SecondaryMultiKeyCreator?

 

4) How do I fetch a secondary relationship? Do I query the SecondaryDatabase manually or is there a convenient mechanism that handles this?

 

5) Is there any real difference between using a SecondaryDatabase/SecondaryMultiKeyCreator and maintaining such a relationship manually using just another Database? Provided that this relationship is transactional etc.

 

 

Cheers,

-Kristoffer

  • 1. Re: Secondary Database
    501863e4-e8fa-4a1a-bbd6-0d2c730c3f67 Newbie

    Sorry, I'm only using the Base API here.

  • 2. Re: Secondary Database
    Bogdan Coman Journeyer

    Hi Kristoffer,

     

    > 1) Creating a relationship from a put operation is pretty straightforward. But what am I expected to do when SecondaryMultiKeyCreator is called during a delete operation? I was expecting that this invocation would NOT occur at all, i.e. deletes are handled automatically?

     

    In general, you cannot modify a secondary database directly. To modify a secondary database, modify the primary database and let JE manage the secondary modifications for you.

    However, as a convenience, you can delete SecondaryDatabase records directly. Doing so causes the associated primary key/data pair to be deleted. This in turn causes JE to delete all SecondaryDatabase records that reference the primary record.

    You can use the SecondaryDatabase.delete() method to delete a secondary database record. Note that if your database supports duplicate records, then only the first record in the matching duplicates set is deleted by this method. To delete all the duplicate records that use a given key, use a SecondaryCursor.

     

    Deleting Secondary Database Records

     

    > 2) When a key/value is updated, secondary keys might need to be updated as well. Some might be added, some removed etc, but I can't find an obvious way to do this except using SecondaryDatabase directly/manually?

     

    When a primary record is created, modified, or deleted, JE automatically updates the secondary database(s) for you as is appropriate for the operation performed on the primary.

     

    > 4) How do I fetch a secondary relationship? Do I query the SecondaryDatabase manually or is there a convenient mechanism that handles this?

    Like a primary database, you can read records from your secondary database either by using the SecondaryDatabase.get() method, or by using a SecondaryCursor. The main difference between reading secondary and primary databases is that when you read a secondary database record, the secondary record's data is not returned to you. Instead, the primary key and data corresponding to the secondary key are returned to you.

     

    http://docs.oracle.com/cd/E17277_02/html/GettingStartedGuide/readSecondary.html

     

    > 5) Is there any real difference between using a SecondaryDatabase/SecondaryMultiKeyCreator and maintaining such a relationship manually using just another Database? Provided that this relationship is transactional etc.

    There's more work on your part. This chapter should answer all your questions about secondaries: http://docs.oracle.com/cd/E17277_02/html/GettingStartedGuide/indexes.html

     

    One more thing I suggest is to check the je-5.0.xx\examples\je\ToManyExample.java example; it has detailed code showing how SecondaryMultiKeyCreator really works.

     

    Let me know if this answers your questions.

     

    Thanks,

    Bogdan

  • 3. Re: Secondary Database
    greybird Expert

    >> You can use the SecondaryDatabase.delete()  method to delete a secondary database record. Note that if your database supports duplicate records, then only the first record in the matching duplicates set is deleted by this method. To delete all the duplicate records that use a given key, use a SecondaryCursor.

    >> Deleting Secondary Database Records

     

    What you said matches the documentation above, but unfortunately that documentation is incorrect.  Calling SecondaryDatabase.delete() deletes all primary records that have the secondary key, as described in the javadoc for this method:

    SecondaryDatabase (Oracle - Berkeley DB Java Edition API)

    I will file a ticket to correct the documentation that is in error.

     

    >> Like a primary database, you can read records from your secondary database either by using the  SecondaryDatabase.get() method, or by using a SecondaryCursor. The main difference between reading secondary and primary databases is that when you read a secondary database record, the secondary record's data is not returned to you. Instead, the primary key and data corresponding to the secondary key are returned to you.

     

    That's correct.  Also, SecondaryDatabase.get() and the SecondaryCursor methods have overloads that additionally return the primary key (which is the secondary record's data) via the pKey parameter.

     

    >>>> 5) Is there any real difference between using a SecondaryDatabase/SecondaryMultiKeyCreator and maintaining such a relationship manually using just another Database? Provided that this relationship is transactional etc.

    >> There's more work on your part.

     

    Right.  In addition, there are features and optimizations (including future optimizations planned) that you may have to re-implement if you don't use the built-in secondaries.  We strongly recommend using the built-in secondaries, unless there is a good reason not to do so.  Have you run across a reason for implementing your own secondaries?

     

    --mark

  • 4. Re: Secondary Database
    501863e4-e8fa-4a1a-bbd6-0d2c730c3f67 Newbie

    The real problem I have is that SecondaryMultiKeyCreator is called whenever something changes, no matter whether it is a create, delete or update. You need to take different actions depending on the operation, even if that means doing nothing, but there is no way of telling what the originating operation was?

     

    I will have a look at the examples also.

  • 5. Re: Secondary Database
    greybird Expert

    JE calls your callback for both the new record (on insert and update) and the old record (on update and delete), and it does the right thing for each operation.  Your callback always simply returns the secondary keys contained in the record.  If you'd like to see the algorithm, take a look at SecondaryDatabase.java.

     

    --mark
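The point made in reply 5 can be illustrated with a small plain-Java model. This is only a sketch of the idea, not JE's code: the class `SecondaryUpdateSketch`, the comma-separated record format and the helper names are all assumptions made for illustration. The key extractor only reports the secondary keys found in a record; the surrounding code compares the keys of the old and new record to decide what to delete and insert, so the extractor never needs to know which operation triggered it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

/** Plain-Java model of secondary index maintenance. Hypothetical; not JE code. */
final class SecondaryUpdateSketch {

    /**
     * Stand-in for the key creator callback: returns the secondary keys
     * contained in a record. A record here is a comma-separated tag list
     * (an assumption for this sketch); null means "no record".
     */
    static Set<String> createSecondaryKeys(String record) {
        Set<String> keys = new TreeSet<>();
        if (record == null) {
            return keys;
        }
        for (String tag : record.split(",")) {
            String trimmed = tag.trim();
            if (!trimmed.isEmpty()) {
                keys.add(trimmed);
            }
        }
        return keys;
    }

    /**
     * Computes the index changes for one primary operation.
     * Insert: oldRecord is null. Delete: newRecord is null. Update: both set.
     */
    static List<String> diff(String oldRecord, String newRecord) {
        Set<String> oldKeys = createSecondaryKeys(oldRecord);
        Set<String> newKeys = createSecondaryKeys(newRecord);
        List<String> ops = new ArrayList<>();
        for (String key : oldKeys) {
            if (!newKeys.contains(key)) {
                ops.add("delete " + key); // key vanished from the record
            }
        }
        for (String key : newKeys) {
            if (!oldKeys.contains(key)) {
                ops.add("insert " + key); // key is new in the record
            }
        }
        return ops;
    }

    public static void main(String[] args) {
        System.out.println(diff(null, "red,blue"));         // primary insert
        System.out.println(diff("red,blue", "blue,green")); // primary update
        System.out.println(diff("blue,green", null));       // primary delete
    }
}
```

In this model an update that leaves a tag unchanged produces no index operation for it, which is why the callback can stay a pure "record in, keys out" function.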

  • 6. Re: Secondary Database
    501863e4-e8fa-4a1a-bbd6-0d2c730c3f67 Newbie

    Ok that's good. I probably messed something up because I get really weird behaviour. Will try again.

     

    Another question. The documentation indicates in several places that records are sorted according to how the BTree comparator sorts keys, but this is not what I am seeing. If I put some records randomly into the database (with a deterministic key order), the cursor returns them in random order?

     

    Is this correct?

  • 7. Re: Secondary Database
    greybird Expert

    Records are sorted by the comparator and returned in that order by the cursor; this works.  The first thing to do is check your comparator.
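One way to check a byte[] comparator in isolation is sketched below. The class `ComparatorCheckSketch` and its `UNSIGNED` comparator are hypothetical and are not the poster's `FastKeyComparator`. The comparator orders keys lexicographically with bytes treated as unsigned. The test keys deliberately include values of 0x80 and above, because that is where a comparator that compares Java's signed bytes directly would produce a different order.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

/** Standalone sanity check for a byte[] key comparator. Hypothetical sketch. */
final class ComparatorCheckSketch {

    /** Lexicographic order with each byte treated as unsigned (0..255). */
    static final Comparator<byte[]> UNSIGNED = (a, b) -> {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int x = a[i] & 0xFF;
            int y = b[i] & 0xFF;
            if (x != y) {
                return x - y;
            }
        }
        // Equal common prefix: the shorter key sorts first.
        return a.length - b.length;
    };

    /** Returns true if keys are in non-decreasing order under cmp. */
    static boolean isSorted(List<byte[]> keys, Comparator<byte[]> cmp) {
        for (int i = 1; i < keys.size(); i++) {
            if (cmp.compare(keys.get(i - 1), keys.get(i)) > 0) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<byte[]> keys = new ArrayList<>();
        // Step through 0..255 so that values >= 0x80 are covered.
        for (int i = 0; i < 256; i += 17) {
            keys.add(new byte[] {(byte) i});
        }
        Collections.shuffle(keys);
        keys.sort(UNSIGNED);
        for (byte[] key : keys) {
            System.out.print((key[0] & 0xFF) + " ");
        }
        System.out.println();
        System.out.println("sorted: " + isSorted(keys, UNSIGNED));
    }
}
```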

  • 8. Re: Secondary Database
    501863e4-e8fa-4a1a-bbd6-0d2c730c3f67 Newbie

    Hmm. My comparator seems to work.

     

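            // Uses java.util.ArrayList, Arrays and Collections.
            // FastKeyComparator is my own Comparator<byte[]>.
            ArrayList<byte[]> list = new ArrayList<>();
            for (int i = 0; i < 10; i++) {
                list.add(new byte[] { (byte) i });
            }

            // Print the keys in shuffled order.
            Collections.shuffle(list);
            for (byte[] key : list) {
                System.out.print(Arrays.toString(key) + " ");
            }
            System.out.println("");

            // Sort with the comparator and print again.
            Collections.sort(list, new FastKeyComparator());
            for (byte[] key : list) {
                System.out.print(Arrays.toString(key) + " ");
            }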

     

    Prints:

     

    [2] [4] [7] [5] [3] [8] [6] [1] [0] [9]

    [0] [1] [2] [3] [4] [5] [6] [7] [8] [9]

     

     

    Maybe I'm missing some other detail?

  • 9. Re: Secondary Database
    501863e4-e8fa-4a1a-bbd6-0d2c730c3f67 Newbie

    I'm doing something wrong here.

     

    Did a minimum integration test with a single database and the comparator works, but so does the cursor ordering.

     

    Need to do more work on my part to find out what is wrong. Sorry for the confusion.

  • 10. Re: Secondary Database
    greybird Expert

    No problem, thanks for letting us know.

  • 11. Re: Secondary Database
    501863e4-e8fa-4a1a-bbd6-0d2c730c3f67 Newbie

    Ok I found the problem. Again, sorry for the confusion.

     

    But I have a follow-up question that first deserves some explanation.

     

    I'm storing data with fixed-length keys, and every key is prefixed according to the type of data stored in the value. The reason for this key design is to get data locality by lining up entries of the same type next to each other, thus allowing more efficient cursor scans over particular types of data.

     

    I'm not sure if this is a good idea or not? Could you give me some idea of how this would perform in terms of scan performance?

     

    This is much how you would do it in HBase for example, where rows are stored according to key order.

  • 12. Re: Secondary Database
    501863e4-e8fa-4a1a-bbd6-0d2c730c3f67 Newbie

    I solved the initial SecondaryDatabase "problem" also. Everything works as expected.

     

    Though, I would appreciate your opinion around cursor scan performance of the key design I mentioned earlier.

     

    Cheers and thank you for a truly excellent persistence alternative!

  • 13. Re: Secondary Database
    greybird Expert

    Yes, the key design you're talking about will be beneficial for a cursor scan.

     

    The Btree is ordered by key.  When you traverse with a cursor, you only have to fetch one IN (internal node) for every 100 records or so, if you're reading in key order.  However, to give you the full picture, if you're also fetching the data (not just the key), then a fetch for each data record is also needed.  By default the API methods do fetch the data, but you can prevent this (if you only need the key) by calling DatabaseEntry.setPartial(0, 0, true).

     

    Anyway, the bottom line is that it is beneficial when reading multiple records to read in key order, and to design your keys so that is possible.

     

    --mark
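The type-prefixed key design discussed in replies 11 and 13 can be modelled in plain Java with a sorted map standing in for the Btree. This is a sketch under stated assumptions, not JE code: the class `PrefixScanSketch`, the one-byte type prefix and the 4-byte big-endian id are all hypothetical choices. The scan positions at the smallest key of the requested type and walks forward until the prefix changes, which is roughly the pattern a JE cursor would follow with `Cursor.getSearchKeyRange` followed by `getNext` calls.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

/** Models a range scan over type-prefixed, fixed-length keys. Hypothetical sketch. */
final class PrefixScanSketch {

    /** Lexicographic order with each byte treated as unsigned. */
    static final Comparator<byte[]> UNSIGNED = (a, b) -> {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int x = a[i] & 0xFF;
            int y = b[i] & 0xFF;
            if (x != y) {
                return x - y;
            }
        }
        return a.length - b.length;
    };

    /** Fixed-length key: one type byte, then a 4-byte big-endian id (id >= 0). */
    static byte[] key(int type, int id) {
        return new byte[] {
            (byte) type,
            (byte) (id >>> 24), (byte) (id >>> 16), (byte) (id >>> 8), (byte) id
        };
    }

    /** Collects the values of one type by walking its contiguous key range. */
    static List<String> scanType(NavigableMap<byte[], String> db, int type) {
        List<String> values = new ArrayList<>();
        // Position at the first key >= (type, 0), then read in key order.
        for (Map.Entry<byte[], String> e : db.tailMap(key(type, 0), true).entrySet()) {
            if ((e.getKey()[0] & 0xFF) != type) {
                break; // left the range for this type; stop scanning
            }
            values.add(e.getValue());
        }
        return values;
    }

    public static void main(String[] args) {
        NavigableMap<byte[], String> db = new TreeMap<>(UNSIGNED);
        db.put(key(2, 300), "order-300");
        db.put(key(1, 7), "user-7");
        db.put(key(2, 5), "order-5");
        db.put(key(3, 0), "item-0");
        System.out.println(scanType(db, 2));
    }
}
```

Because all keys of one type are adjacent, the scan touches only that type's entries plus the single entry that ends the range, regardless of the order in which records were written.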

  • 14. Re: Secondary Database
    501863e4-e8fa-4a1a-bbd6-0d2c730c3f67 Newbie

    Thanks, great news. Appreciate your elaborate explanation/advice about the internals.
