Db.DB_APPEND does not append to the end of a record in the standard edition; I think you misunderstood what it does. In any case, appending of any sort is not supported in BDB JE.
If you're seeing performance problems, it may be because you're using read-modify-write to maintain a list in a single record. This is not recommended for optimal performance, because each update adds a new log entry in JE's append-only storage system.
Instead, try inserting a separate record for each list entry.
Thanks a lot, Mark. But isn't there a motivation for an append feature? In the case mentioned above, handling a read request means reading all the key-value pairs, which may take a lot of time. Also, inserting a separate record for each list entry may not provide the "locality property", as the data may not be stored consecutively. Is there any database that can store a linked list directly as a value and update part of it?
BDB-JE is based on what is called the "log-structured-file" architecture. This does not allow for any in-place updates of data; every update of a record creates a new record version, which is appended at the end of a log file (an internal "cleaner" module is responsible for deleting older versions to reclaim disk space). There are pros and cons to this architecture, but it is what it is.
For your case, the best approach may be to store the list items as separate records in a "duplicates" DB. Records in a duplicates DB have, conceptually, 2 keys, a primary and a secondary key, and they are sorted first by their primary key and then by their secondary key. Furthermore, the whole records are stored in the leaf nodes of a BTree. So, consecutive records are stored physically next to each other as long as they fit within the same BTree node. This provides a degree of locality. In your case, the primary key would be the list id, and assuming that the lists are append-only, the secondary key could be the position of the record within the list. Notice that secondary keys are not explicitly exposed at the API. You declare them implicitly by providing a "comparator" class to compare the value portions of the (primary) key/value pairs.
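To make the sort order concrete, here is a rough model in plain Java. It is not JE code: a TreeMap stands in for the BTree, and ListEntryKey, listId, position, and ListStoreSketch are hypothetical names used only for illustration. It shows that when records are ordered by list id and then by position, the entries of one list are adjacent and can be read with a single range scan.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Hypothetical composite key: list id first, then position within the list.
final class ListEntryKey implements Comparable<ListEntryKey> {
    final String listId;
    final int position;

    ListEntryKey(String listId, int position) {
        this.listId = listId;
        this.position = position;
    }

    @Override
    public int compareTo(ListEntryKey other) {
        int c = listId.compareTo(other.listId);
        return c != 0 ? c : Integer.compare(position, other.position);
    }
}

final class ListStoreSketch {
    // TreeMap stands in for the sorted BTree; this is not a JE API.
    private final TreeMap<ListEntryKey, String> records = new TreeMap<>();

    // Appending an item is one small insert, not a rewrite of the whole list.
    void append(String listId, int position, String item) {
        records.put(new ListEntryKey(listId, position), item);
    }

    // Reads one list with a range scan over adjacent keys, in position order.
    List<String> readList(String listId) {
        ListEntryKey from = new ListEntryKey(listId, Integer.MIN_VALUE);
        ListEntryKey to = new ListEntryKey(listId, Integer.MAX_VALUE);
        return new ArrayList<>(records.subMap(from, true, to, true).values());
    }
}
```

In a real duplicates DB the list id would be the key and the position would be encoded in the data, with the ordering supplied by the comparator as described above.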
Regarding other databases, many of the more traditional RDBMSs provide "large-object" (LOB) support, which I believe can do what you are asking for (of course, internally, the LOBs may also be broken up into several chunks). I do not know if any of the newer NoSQL databases offer such capabilities.
Thanks.. It is very useful
I'll add one thing about the approach that Markos described. Duplicate databases, while they may be useful for this case, have one limitation to keep in mind: You cannot create a secondary index on the records in the duplicate database. This may not be important to you, but I thought I should mention it.
Is there a way to directly access the last duplicate record for a particular key? One way is to use "cursor.getSearchKey()" to reach the first entry and, from there, iterate "cursor.count()" times using "cursor.getNextDup()" with LockMode.READ_UNCOMMITTED. But this takes longer as the number of duplicate records increases.
The best technique for this is to create a "successor" key for the main key (not the duplicate key) you're looking for. So let's say you have records like this in a duplicate DB:
key: a data: 1
key: a data: 99999
key: c data: 1
key: c data: 99999
If you want to move a cursor to the last record having key a, first create the successor key to a. The successor is defined as the smallest possible key that is greater. For integer keys, calculating the successor is simple: you just add one. For a variable length string key, the successor can be created by appending a zero character, or "a\u0000" in this example, since zero sorts before all other characters.
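As a small sketch of the successor calculation in plain Java (SuccessorKeys is a hypothetical helper, not part of JE): the byte-array variant assumes keys are compared byte by byte as unsigned values with a shorter prefix sorting first, so check it against whatever comparator your database actually uses.

```java
import java.util.Arrays;

final class SuccessorKeys {
    // Smallest string greater than key under String.compareTo: append U+0000.
    static String successor(String key) {
        return key + '\u0000';
    }

    // Same idea for raw byte keys (assumed unsigned byte-by-byte order):
    // copy the key and append one zero byte.
    static byte[] successor(byte[] key) {
        return Arrays.copyOf(key, key.length + 1); // new trailing byte is 0
    }

    // Integer keys: add one. Integer.MAX_VALUE has no successor, so
    // addExact throws ArithmeticException instead of wrapping around.
    static int successor(int key) {
        return Math.addExact(key, 1);
    }
}
```

Note that the largest integer key has no successor, so that case needs separate handling (for example, positioning on the last record of the database instead).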
Position on the first duplicate for the successor key using getSearchKeyRange, and then call getPrev to move to the last duplicate of the preceding key (a in this case).
Unfortunately, this isn't quite enough. In the example, getSearchKeyRange will position to the first duplicate for key c. After doing this, another transaction may insert a record with key b. In that case getPrev will move to the newly inserted record, which is not what you want. Therefore, you should create a loop calling getPrev and comparing the key returned to a. The last duplicate for a is found when a key equal to a is found. It's possible of course that there is no record with key a, and the loop should exit when the key found is less than or equal to a, or NOTFOUND is returned by getPrev.
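The procedure can be modelled in plain Java to show the control flow. This is only a sketch, not JE code: a ConcurrentSkipListSet stands in for the duplicates DB, ceiling() plays the role of getSearchKeyRange, lower() plays the role of getPrev, and LastDuplicateSketch and Rec are hypothetical names.

```java
import java.util.Iterator;
import java.util.concurrent.ConcurrentSkipListSet;

final class LastDuplicateSketch {
    // One key/data pair; sorted by key, then by data, like a duplicates DB.
    static final class Rec implements Comparable<Rec> {
        final String key;
        final int data;

        Rec(String key, int data) {
            this.key = key;
            this.data = data;
        }

        @Override
        public int compareTo(Rec o) {
            int c = key.compareTo(o.key);
            return c != 0 ? c : Integer.compare(data, o.data);
        }
    }

    // Stand-in for the database; other threads may insert concurrently.
    final ConcurrentSkipListSet<Rec> records = new ConcurrentSkipListSet<>();

    // Returns the last record whose key equals `key`, or null if there is none.
    Rec lastDuplicate(String key) {
        String successor = key + '\u0000';
        // Analogue of getSearchKeyRange: first record with key >= successor.
        Rec pos = records.ceiling(new Rec(successor, Integer.MIN_VALUE));
        // Analogue of getPrev; if nothing follows `key`, start from the last record.
        pos = (pos == null) ? lastOrNull() : records.lower(pos);
        // Skip any records with a greater key that were inserted in the meantime.
        while (pos != null && pos.key.compareTo(key) > 0) {
            pos = records.lower(pos);
        }
        // A smaller key (or no record at all) means `key` is not present.
        return (pos != null && pos.key.equals(key)) ? pos : null;
    }

    private Rec lastOrNull() {
        Iterator<Rec> it = records.descendingIterator();
        return it.hasNext() ? it.next() : null;
    }
}
```

With the example records above, lastDuplicate("a") lands on the record with data 99999, and lastDuplicate("b") returns null because stepping back from c reaches a key smaller than b.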
Note that the loop is not necessary if you're using Serializable isolation (see TransactionConfig.setSerializableIsolation), since this isolation mode prevents other threads from inserting records in a key range you've read. However, for best performance this mode is not recommended, since it requires extra internal searching and locking.
Also note that the same approach applies if you're not using duplicates, when you have a multi-part main key.
One final note: If you're concerned about performance and you're using duplicates, be aware that Cursor.count() iterates through the records. So there is no advantage to calling count() and then iterating that many times -- this doubles the work. Plus, it's not an accurate way to find the last duplicate, since another thread can insert a duplicate ahead of the cursor position while you're iterating.
Thanks for the reply. If I understand correctly, in the above example, will the successor key for "a", i.e. "a\u0000", be a synonym for key "c"? My concern is that the clients inserting into the database insert independently for every new key. Care has been taken that the keys are unique, but they are otherwise unrelated.
I think a similar thing can be done as follows:
Go to the first duplicate entry using "getSearchKey()"; then getNextNoDup() will move the cursor to the next key in the database. Then getPrev() can be used to move to the last duplicate entry of the previous key.
Also, if getNextNoDup() returns NOTFOUND, it indicates that no key greater than the current key exists. Hence, getPrev() will point to the last entry of the database, which is the required key/value pair.
It would be great if you could confirm whether this is correct.
"a\u0000" is not a synonym for key "c", it is a key greater than "a" and less than or equal to all keys greater than "a". It is the first possible key that follows "a", in other words, the "successor".
When the successor is passed to getSearchKeyRange, which returns a key greater or equal to its parameter, this method will return whatever key follows "a" in the database ("c" in the example).
Yes, you are correct, getNextNoDup can be used in a similar way. Also in this case when calling getPrev you need to do it in a loop in case another thread inserts a record between "a" and "c", after the cursor has moved to "c", like I described earlier.
I got it. Thanks a lot, Mark. Yes, I will take care of the 'loop case'.