This discussion is archived
4 Replies Latest reply: Aug 8, 2008 4:45 PM by greybird RSS

Questions on concurrency control

571477 Newbie
Currently Being Moderated
Hi all,

I've tried to run a benchmark that mimics the TPC-B but with the isolation level serializable the number of aborts is extremely high. To understand why I started browsing the Sleepycat's code and I could not understand the following points:

1 - In method Cursor.java:searchExactAndRangeLock, there is
/* The keyChange value is independent of the status value. */
noNextKeyFound = !result.keyChange;
/* Lock the EOF node if no more records, whether or not more dups. */
if (noNextKeyFound) {
cursorImpl.lockEofNode(LockType.RANGE_READ);

Is this right ? I mean every time a "get" is issued the end of file is locked even when the key exists.
Could I replace noNextKeyFound = !result.keyChange; by noNextKeyFound = (result.status != OperationStatus.SUCCESS); ?

2 - Cursor.java:putNoNotify
DatabaseEntry nextKey = new DatabaseEntry
(key.getData(), key.getOffset(), key.getSize());
DatabaseEntry nextData = new DatabaseEntry
(data.getData(), data.getOffset(), data.getSize());
nextKey.setPartial(0, 0, true);
nextData.setPartial(0, 0, true);

nextKeyCursor.lockNextKeyForInsert(key, data, nextKey, nextData);

The value of "nextKey" does not always match the next key available in the database. By the way, I've changed the signature of the method nextKeyCursor.lockNextKeyForInsert to return which was supposed to be next key value.

3 - I did not see how phantoms generated by deletes are avoided.


Cheers.
  • 1. Re: Questions on concurrency control
    greybird Expert
    Currently Being Moderated
    Hi Alfranio,

    It's great that you're doing this type of testing and that you've dug into the source code -- we appreciate it, and I'm sure others will benefit from your work.
    1 - In method Cursor.java:searchExactAndRangeLock,
    there is
    /* The keyChange value is independent of the status
    value. */
    noNextKeyFound = !result.keyChange;
    /* Lock the EOF node if no more records, whether or
    not more dups. */
    if (noNextKeyFound) {
    cursorImpl.lockEofNode(LockType.RANGE_READ);

    Is this right ? I mean every time a "get" is issued
    the end of file is locked even when the key exists.
    Could I replace noNextKeyFound = !result.keyChange;
    by noNextKeyFound = (result.status !=
    OperationStatus.SUCCESS); ?
    I've done a little testing and it looks like there is a problem here. I'm not sure yet whether the fix is as simple as you suggest. I will be digging deeper and adding more cases to our test suite, and I'll get back to you on the results sometime next week.
    2 - Cursor.java:putNoNotify
    DatabaseEntry nextKey = new DatabaseEntry
    (key.getData(), key.getOffset(), key.getSize());
    DatabaseEntry nextData = new DatabaseEntry
    (data.getData(), data.getOffset(), data.getSize());
    nextKey.setPartial(0, 0, true);
    nextData.setPartial(0, 0, true);

    nextKeyCursor.lockNextKeyForInsert(key, data,
    nextKey, nextData);

    The value of "nextKey" does not always match the next
    key available in the database. By the way, I've
    changed the signature of the method
    nextKeyCursor.lockNextKeyForInsert to return which
    was supposed to be next key value.
    Can you give an example of what you're seeing? -- The keys in the database, the key being inserted, and the nextKey that is locked (incorrectly). Or if you have one, please send a test program that shows the problem.
    3 - I did not see how phantoms generated by deletes
    are avoided.
    In JE, a deleted record is left in the Btree until after the transaction ends, and the deleted record remains locked. Other transactions attempting to insert a record with the same key are blocked. This is true no matter what isolation level is used. Does this answer your question?

    Thanks again for contributing your efforts on this.

    --mark                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       
  • 2. Re: Questions on concurrency control
    571477 Newbie
    Currently Being Moderated
    Hi Mark,
    Can you give an example of what you're seeing? --
    The keys in the database, the key being inserted, and
    the nextKey that is locked (incorrectly). Or if you
    have one, please send a test program that shows the
    problem.
    A simple application is sequentially inserting the following keys on behalf of a single transaction: k1, k2, k4 and k5.

    The output of the first run:
    range-01: k1 -- k1
    range-02: k1 -- EOF
    range-01: k2 -- k2
    range-02: k2 -- k1
    range-01: k4 -- k4
    range-02: k4 -- k2
    range-01: k5 -- k5
    range-02: k5 -- k4

    The output of successive runs:
    range-01: k1 --
    range-02: k1 -- k2
    range-01: k2 --
    range-02: k2 -- k4
    range-01: k4 -- k4
    range-02: k4 -- k4
    range-01: k5 -- k5
    range-02: k5 -- k4

    I don't understand these values.. :(

    By the way, I don't know how to attach files by means of the interface provided in the forum, so I am copying the modifications done to the source code. Let me know if you need any else.

    Cheers.

    alfranio

    === modified file 'src/com/sleepycat/je/Cursor.java'
    --- src/com/sleepycat/je/Cursor.java 2008-06-12 16:21:24 +0000
    +++ src/com/sleepycat/je/Cursor.java 2008-08-04 19:08:15 +0000
    @@ -1375,6 +1375,31 @@
    }
    }

    + private CursorImpl getCursor(CursorImpl orig) throws DatabaseException {
    + CursorImpl dup = null;
    +
    + if (orig != null) {
    + dup = orig.cloneCursor(true);
    + }
    + return (dup);
    + }
    +
    + private OperationStatus rangeLockCurrentPosition(DatabaseEntry key, DatabaseEntry data, CursorImpl orig)
    + throws DatabaseException {
    + OperationStatus status = OperationStatus.NOTFOUND;
    + CursorImpl dup = getCursor(orig);
    +
    + if (dup != null) {
    + try {
    + status = dup.getCurrent(key, data, LockType.NONE);
    + } finally {
    + dup.close();
    + }
    + }
    +
    + return (status);
    + }
    +
    /**
    * Performs the put operation but does not notify triggers (does not
    * perform secondary updates). Prevents phantoms.
    @@ -1399,7 +1424,45 @@
    nextKeyCursor = new CursorImpl(dbImpl, nextKeyLocker);
    /* Perform eviction for user cursors. */
    nextKeyCursor.setAllowEviction(true);
    - nextKeyCursor.lockNextKeyForInsert(key, data);
    +
    + DatabaseEntry nextKey = new DatabaseEntry
    + (key.getData(), key.getOffset(), key.getSize());
    + DatabaseEntry nextData = new DatabaseEntry
    + (data.getData(), data.getOffset(), data.getSize());
    + nextKey.setPartial(0, 0, true);
    + nextData.setPartial(0, 0, true);
    +
    + nextKeyCursor.lockNextKeyForInsert(key, data, nextKey, nextData);
    +
    + if (nextKey.getData() == null) {
    + try {
    + System.out.println("range-01: " + new String(key.getData(),"UTF-8") + " -- EOF");
    + } catch (UnsupportedEncodingException e) {
    + e.printStackTrace(System.err);
    + }
    + } else {
    + try {
    + System.out.println("range-01: " + new String(key.getData(),"UTF-8") + " -- " + new String(nextKey.getData(),"UTF-8"));
    + } catch (UnsupportedEncodingException e) {
    + e.printStackTrace(System.err);
    + }
    + }
    +
    + DatabaseEntry endKey = new DatabaseEntry();
    + DatabaseEntry endData = new DatabaseEntry();
    + if (rangeLockCurrentPosition(endKey,endData,nextKeyCursor) != OperationStatus.SUCCESS) {
    + try {
    + System.out.println("range: " + new String(key.getData(),"UTF-8") + " -- EOF");
    + } catch (UnsupportedEncodingException e) {
    + e.printStackTrace(System.err);
    + }
    + } else {
    + try {
    + System.out.println("range: " + new String(key.getData(),"UTF-8") + " -- " + new String(endKey.getData(),"UTF-8"));
    + } catch (UnsupportedEncodingException e) {
    + e.printStackTrace(System.err);
    + }
    + }
    }

    /* Perform the put operation. */

    === modified file 'src/com/sleepycat/je/dbi/CursorImpl.java'
    --- src/com/sleepycat/je/dbi/CursorImpl.java 2008-06-12 16:21:24 +0000
    +++ src/com/sleepycat/je/dbi/CursorImpl.java 2008-08-04 18:28:04 +0000
    @@ -955,15 +955,10 @@
    * records following the given key and datum, lock the special EOF node
    * for the databaseImpl.
    */
    - public void lockNextKeyForInsert(DatabaseEntry key, DatabaseEntry data)
    + public OperationStatus lockNextKeyForInsert(DatabaseEntry key, DatabaseEntry data,
    + DatabaseEntry nextKey, DatabaseEntry nextData)
    throws DatabaseException {
    -
    - DatabaseEntry tempKey = new DatabaseEntry
    - (key.getData(), key.getOffset(), key.getSize());
    - DatabaseEntry tempData = new DatabaseEntry
    - (data.getData(), data.getOffset(), data.getSize());


    - tempKey.setPartial(0, 0, true);
    - tempData.setPartial(0, 0, true);
    + OperationStatus status = OperationStatus.NOTFOUND;
    boolean lockedNextKey = false;

    /* Don't search for data if duplicates are not configured. */
    @@ -988,7 +983,6 @@
    * If we didn't match the key, skip over duplicates to the next
    * key with getNextNoDup.
    */
    - OperationStatus status;
    if ((searchResult & EXACT_KEY) != 0) {
    status = getNext
    (tempKey, tempData, LockType.RANGE_INSERT, true, true);
  • 3. Re: Questions on concurrency control
    greybird Expert
    Currently Being Moderated
    1 - In method Cursor.java:searchExactAndRangeLock,
    there is
    /* The keyChange value is independent of the
    status
    value. */
    noNextKeyFound = !result.keyChange;
    /* Lock the EOF node if no more records, whether
    or
    not more dups. */
    if (noNextKeyFound) {
    cursorImpl.lockEofNode(LockType.RANGE_READ);

    Is this right ? I mean every time a "get" is
    issued
    the end of file is locked even when the key
    exists.
    Could I replace noNextKeyFound =
    !result.keyChange;
    by noNextKeyFound = (result.status !=
    OperationStatus.SUCCESS); ?
    I've done a little testing and it looks like there is
    a problem here. I'm not sure yet whether the fix is
    as simple as you suggest. I will be digging deeper
    and adding more cases to our test suite, and I'll get
    back to you on the results sometime next week.
    Alfranio, it turns out that you are exactly correct. This is a bug that reduces concurrency for Serializable isolation, and the problem code is exactly what you have pointed out.

    We've added to our test suite to thoroughly cover this issue and potential related issues, and we've added a fix that will be in our next JE 3.3 patch release. For now, I'll copy the patch itself below.

    Next I'll take a look at your issue (2) and get back to on that in the next few days. When that is resolved, I'll make a JE jar with these fixes available on request.

    Thanks again!
    --mark
    Index: src/com/sleepycat/je/Cursor.java
    ===================================================================
    RCS file: /a/CVSROOT/je/src/com/sleepycat/je/Cursor.java,v
    retrieving revision 1.216
    diff -c -w -r1.216 Cursor.java
    *** src/com/sleepycat/je/Cursor.java    10 Jun 2008 02:52:08 -0000      1.216
    --- src/com/sleepycat/je/Cursor.java    7 Aug 2008 01:00:22 -0000
    ***************
    *** 1698,1704 ****
                  SearchMode.SET_RANGE : SearchMode.BOTH_RANGE;
    
              KeyChangeStatus result = null;
    -         boolean noNextKeyFound;
    
              CursorImpl dup =
                  beginRead(false /* searchAndPosition will add cursor */);
    --- 1698,1703 ----
    ***************
    *** 1714,1722 ****
                      (dup, key, data, searchLockType, advanceLockType, searchMode,
                       true /*advanceAfterRangeSearch*/);
      
    -             /* The keyChange value is independent of the status value. */
    -             noNextKeyFound = !result.keyChange;
    - 
                  /* If the key changed, then we do not have an exact match. */
                  if (result.keyChange && result.status == OperationStatus.SUCCESS) {
                      result.status = OperationStatus.NOTFOUND;
    --- 1713,1718 ----
    ***************
    *** 1726,1733 ****
                               result.status == OperationStatus.SUCCESS);
              }
    
    !         /* Lock the EOF node if no more records, whether or not more dups. */
    !         if (noNextKeyFound) {
                  cursorImpl.lockEofNode(LockType.RANGE_READ);
              }
    
    --- 1722,1732 ----
                               result.status == OperationStatus.SUCCESS);
              }
    
    !         /*
    !          * Lock the EOF node if there was no exact match and we did not
    !          * range-lock the next record.
    !          */
    !         if (result.status != OperationStatus.SUCCESS && !result.keyChange) {
                  cursorImpl.lockEofNode(LockType.RANGE_READ);
              }
  • 4. Re: Questions on concurrency control
    greybird Expert
    Currently Being Moderated
    Alfranio, on your issue (2), I recreated the situation you described and everything looks OK -- I don't see a problem.

    In the code you added, the wrong results were printed. First, if CursorImpl.lockNextKeyForInsert does not successfully lock the next key, then it is invalid to print out the contents of the DatabaseEntry params -- these are only meaningful if the operation succeeds. Second, if you call setPartial, the entry won't be returned, which is why you were seeing a null data array in some cases.

    I changed your range-01 code so it works with CursorImpl. And I deleted your range-02 code (actually this was just called "range" in the diff you sent, I don't think you sent the diff you were running with). The results are exactly as you would except.

    The initial run prints:

    range: k1 -- EOF
    range: k2 -- EOF
    range: k4 -- EOF
    range: k5 -- EOF

    And subsequent runs print:

    range: k1 -- k2
    range: k2 -- k4
    range: k4 -- k5
    range: k5 -- EOF

    Below is the diff I used, which I believe is equivalent to what your code was attempting. Below that is the test program I used.

    Please let me know whether this answers your questions and whether you agree that this is correct.
    Thanks,
    --mark
    Index: src/com/sleepycat/je/Cursor.java
    ===================================================================
    RCS file: /a/CVSROOT/je/src/com/sleepycat/je/Cursor.java,v
    retrieving revision 1.216.2.1
    diff -c -w -r1.216.2.1 Cursor.java
    *** src/com/sleepycat/je/Cursor.java    7 Aug 2008 17:04:46 -0000       1.216.2.1
    --- src/com/sleepycat/je/Cursor.java    8 Aug 2008 23:31:41 -0000
    ***************
    *** 11,16 ****
    --- 11,17 ----
      import java.util.logging.Level;
      import java.util.logging.Logger;
    
    + import com.sleepycat.bind.tuple.StringBinding;
      import com.sleepycat.je.dbi.CursorImpl;
      import com.sleepycat.je.dbi.DatabaseImpl;
      import com.sleepycat.je.dbi.GetMode;
    ***************
    *** 1399,1405 ****
                      nextKeyCursor = new CursorImpl(dbImpl, nextKeyLocker);
                      /* Perform eviction for user cursors. */
                      nextKeyCursor.setAllowEviction(true);
    !                 nextKeyCursor.lockNextKeyForInsert(key, data);
                  }
    
                  /* Perform the put operation. */
    --- 1400,1421 ----
                      nextKeyCursor = new CursorImpl(dbImpl, nextKeyLocker);
                      /* Perform eviction for user cursors. */
                      nextKeyCursor.setAllowEviction(true);
    ! 
    !                 DatabaseEntry nextKey = new DatabaseEntry
    !                        (key.getData(), key.getOffset(), key.getSize());
    !                 DatabaseEntry nextData = new DatabaseEntry
    !                        (data.getData(), data.getOffset(), data.getSize());
    ! 
    !                 if (nextKeyCursor.lockNextKeyForInsert
    !                     (key, data, nextKey, nextData)) {
    !                     System.out.println
    !                         ("range: " + StringBinding.entryToString(key) +
    !                          " -- " + StringBinding.entryToString(nextKey));
    !                 } else {
    !                     System.out.println
    !                         ("range: " + StringBinding.entryToString(key) +
    !                          " -- EOF");
    !                 }
                  }
    
                  /* Perform the put operation. */
    Index: src/com/sleepycat/je/dbi/CursorImpl.java
    ===================================================================
    RCS file: /a/CVSROOT/je/src/com/sleepycat/je/dbi/CursorImpl.java,v
    retrieving revision 1.348
    diff -c -w -r1.348 CursorImpl.java
    *** src/com/sleepycat/je/dbi/CursorImpl.java    20 May 2008 03:27:34 -0000      1.348
    --- src/com/sleepycat/je/dbi/CursorImpl.java    8 Aug 2008 23:31:44 -0000
    ***************
    *** 955,969 ****
           * records following the given key and datum, lock the special EOF node
           * for the databaseImpl.
           */
    !     public void lockNextKeyForInsert(DatabaseEntry key, DatabaseEntry data)
              throws DatabaseException {
    
    -         DatabaseEntry tempKey = new DatabaseEntry
    -             (key.getData(), key.getOffset(), key.getSize());
    -         DatabaseEntry tempData = new DatabaseEntry
    -             (data.getData(), data.getOffset(), data.getSize());
    -         tempKey.setPartial(0, 0, true);
    -         tempData.setPartial(0, 0, true);
              boolean lockedNextKey = false;
    
              /* Don't search for data if duplicates are not configured. */
    --- 955,966 ----
           * records following the given key and datum, lock the special EOF node
           * for the databaseImpl.
           */
    !     public boolean lockNextKeyForInsert(DatabaseEntry key,
    !                                         DatabaseEntry data,
    !                                         DatabaseEntry tempKey,
    !                                         DatabaseEntry tempData)
              throws DatabaseException {
    
              boolean lockedNextKey = false;
    
              /* Don't search for data if duplicates are not configured. */
    ***************
    *** 1011,1016 ****
    --- 1008,1014 ----
              if (!lockedNextKey) {
                  lockEofNode(LockType.RANGE_INSERT);
              }
    +         return lockedNextKey;
          }
    
          /**
    -----------------------------------------
    import java.io.File;
    import com.sleepycat.bind.tuple.StringBinding;
    import com.sleepycat.je.Database;
    import com.sleepycat.je.DatabaseConfig;
    import com.sleepycat.je.DatabaseEntry;
    import com.sleepycat.je.Environment;
    import com.sleepycat.je.EnvironmentConfig;
    import com.sleepycat.je.OperationStatus;
    import com.sleepycat.je.Transaction;
    import com.sleepycat.je.TransactionConfig;
    
    public class Test {
        public static void main(String[] args) throws Exception {
            EnvironmentConfig envConfig = new EnvironmentConfig();
            envConfig.setAllowCreate(true);
            envConfig.setTransactional(true);
            Environment env = new Environment(new File("./tmp"), envConfig);
    
            DatabaseConfig dbConfig = new DatabaseConfig();
            dbConfig.setAllowCreate(true);
            dbConfig.setTransactional(true);
            Database db = env.openDatabase(null, "foo", dbConfig);
    
            TransactionConfig txnConfig = new TransactionConfig();
            txnConfig.setSerializableIsolation(true);
            Transaction txn = env.beginTransaction(null, txnConfig);
    
            put(db, "k1", "1");
            put(db, "k2", "2");
            put(db, "k4", "4");
            put(db, "k5", "5");
    
            txn.abort();
            db.close();
            env.close();
        }
    
        static void put(Database db, String key, String data) throws Exception {
            DatabaseEntry keyEntry = new DatabaseEntry();
            DatabaseEntry dataEntry = new DatabaseEntry();
            StringBinding.stringToEntry(key, keyEntry);
            StringBinding.stringToEntry(data, dataEntry);
            OperationStatus status = db.put(null, keyEntry, dataEntry);
            if (status != OperationStatus.SUCCESS) {
                System.out.println(status);
            }
        }
    }