10 Replies. Latest reply: Jan 16, 2012 5:56 PM by Charles Lamb

Durability option: writing to stage storage

906230 Newbie
Hi all,

Please, can someone explain how that durability option works?
Where and how can we set that option?
What are the aim and the benefit of that option?
Is that option available in the current version of Oracle NoSQL?
Thanks.
  • 1. Re: Durability option: writing to stage storage
    Charles Lamb Pro
    user962305 wrote:
    Hi all,

    Please, can someone explain how that durability option works?
    Where and how can we set that option?
    What are the aim and the benefit of that option?
    Is that option available in the current version of Oracle NoSQL?

    Please see http://docs.oracle.com/cd/NOSQL/html/javadoc/oracle/kv/KVStore.html for information on the API, including when and where you can use the Durability option. If you have more questions after reading through that, I'll be happy to answer them.

    In general, Durability is meant to allow the program to specify, on a per-operation basis, the level of durability for writes and updates.

    Yes, it is available in all versions of Oracle NoSQL.

    Charles Lamb

    Edited by: Charles Lamb on Jan 6, 2012 9:02 AM
  • 2. Re: Durability option: writing to stage storage
    896774 Newbie
    Excuse me, Charles, I was referring to the two dimensions of durability that I read about here:
    http://www.nocoug.org/download/2011-11/Marie-Anne_Neimat_NoSQL_Database.pdf (page 21)

    It says there that we can choose whether to commit to RAM or commit to disk.
    I cannot find any information on that flexibility anywhere. How and when can we choose that?
  • 3. Re: Durability option: writing to stage storage
    greybird Expert
    Both dimensions to durability are specified using the API Charlie pointed to. The Durability parameter describes both. Also see:
    http://docs.oracle.com/cd/NOSQL/html/javadoc/oracle/kv/Durability.html
    --mark
  • 4. Re: Durability option: writing to stage storage
    Charles Lamb Pro
    There are three elements of Durability: master sync policy, replica sync policy, and replica ack policy. The first two specify the level of durability for a transaction's data on the master and replica, respectively. The last lets the user specify the number of replicas that must acknowledge a transaction before it is acknowledged to the user.

    Sync policy (the first two elements of durability that I mentioned above) can be either no_sync (just write to the JVM memory), write_no_sync (just write to the OS but do not force to disk), or sync (force to disk). I believe this is what you were referring to. The javadoc that I (and Mark) referenced above has more details.

    Charles Lamb
  • 5. Re: Durability option: writing to stage storage
    906230 Newbie
    Sorry,
    I am not sure I have understood sync policy.
    Documentation says:
    NO_SYNC
    Do not write or synchronously flush the log on transaction commit.
    SYNC
    Write and synchronously flush the log on transaction commit.
    WRITE_NO_SYNC
    Write but do not synchronously flush the log on transaction commit.

    What do you mean by "flush the log"?
    Where do you need to write apart from the log file?
    Is there one log file for current transactions and another file for committed transactions?
    With write_no_sync, when do you decide to flush the log? When it is full?
    With sync, if there is a crash, why would we lose data that is on disk, even if the log was not flushed?
    Thanks
  • 6. Re: Durability option: writing to stage storage
    greybird Expert
    There is only one log. NoSQL DB uses BDB Java Edition, which is an append-only storage system.

    By "write" we mean issue a file system write. Such writes are not guaranteed to be durable, because the file system and hardware may buffer the data. An application (JVM) crash after a write will not cause data loss, but a system/OS crash may cause data loss.

    By "synchronously flush" we mean issue a file system fsync. This pushes the data all the way to the storage device, so data loss will not occur even if there is a system/OS crash.

    --mark
  • 7. Re: Durability option: writing to stage storage
    62600 Newbie
    So in plain English you are saying:
    NO_SYNC = NO_WRITE_NO_SYNC
    WRITE_NO_SYNC = WRITE_NO_SYNC
    SYNC = WRITE_AND_SYNC
    Correct?
  • 8. Re: Durability option: writing to stage storage
    62600 Newbie
    Reading it again, my previous post is wrong. It should be:
    NO_SYNC = NO_SYNC
    WRITE_NO_SYNC = NO_SYNC_HOWEVER_WRITE
    SYNC = SYNC
    ?
    Probably like user962305, I find the "write" wording confusing. I expected "write" to be the ultimate operation, somehow more glorified than "sync".
  • 9. Re: Durability option: writing to stage storage
    greybird Expert
    Your previous post was correct so I'm going to repeat it:

    NO_SYNC = no write, no fsync
    WRITE_NO_SYNC = write, no fsync
    SYNC = write and fsync

    I think these names are historical and I can see how they are confusing.
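
    The mapping above can be written down as a small table in code. This is only an illustrative model of a single node's log, not the oracle.kv API: the enum name SyncPolicyModel and its fields and methods are made up for the example, and the crash guarantees assume the file system and hardware honor fsync.

    ```java
    // Hypothetical model of the three sync policies; NOT the oracle.kv.Durability API.
    enum SyncPolicyModel {
        NO_SYNC(false, false),        // no write, no fsync
        WRITE_NO_SYNC(true, false),   // write, no fsync
        SYNC(true, true);             // write and fsync

        final boolean writesToOs;     // a file system write is issued at commit
        final boolean fsyncs;         // an fsync is issued at commit

        SyncPolicyModel(boolean writesToOs, boolean fsyncs) {
            this.writesToOs = writesToOs;
            this.fsyncs = fsyncs;
        }

        // Once the data has been handed to the OS, a JVM crash cannot lose it.
        boolean guaranteedAfterJvmCrash() {
            return writesToOs;
        }

        // Only an fsync protects the data against a system/OS crash.
        boolean guaranteedAfterOsCrash() {
            return fsyncs;
        }

        public static void main(String[] args) {
            for (SyncPolicyModel p : values()) {
                System.out.println(p + ": write=" + p.writesToOs + ", fsync=" + p.fsyncs);
            }
        }
    }
    ```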

    --mark
  • 10. Re: Durability option: writing to stage storage
    Charles Lamb Pro
    greybird wrote:
    Your previous post was correct so I'm going to repeat it:

    NO_SYNC = no write, no fsync
    WRITE_NO_SYNC = write, no fsync
    SYNC = write and fsync

    I think these names are historical and I can see how they are confusing.

    To be even more specific, "write" means that the Java program (BDB JE) issues a RandomAccessFile.write() call, which in turn issues a write(2) call to the underlying file system. Hence, the data goes from the JVM to the OS. This protects against JVM failure, but not system failure.
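
    For illustration, the difference between "write" and "write and fsync" can be sketched with plain JDK calls. This is a minimal sketch and not BDB JE's actual code: the class WriteVsSyncSketch and its methods are hypothetical names, and FileDescriptor.sync() stands in here for the fsync step.

    ```java
    import java.io.File;
    import java.io.IOException;
    import java.io.RandomAccessFile;
    import java.io.UncheckedIOException;
    import java.nio.charset.StandardCharsets;

    // Illustrative sketch only; class and method names are assumptions, not BDB JE code.
    class WriteVsSyncSketch {

        // Creates a temporary file to play the role of the append-only log.
        static File tempLog() {
            try {
                File f = File.createTempFile("sketch", ".log");
                f.deleteOnExit();
                return f;
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }

        // Appends a record to the end of the file and returns the new file length.
        // forceToDisk == false: write only (data is handed from the JVM to the OS).
        // forceToDisk == true:  write, then ask the OS to push the data to the device.
        static long append(File logFile, String record, boolean forceToDisk) {
            try (RandomAccessFile raf = new RandomAccessFile(logFile, "rw")) {
                raf.seek(raf.length());                              // always write at the end
                raf.write(record.getBytes(StandardCharsets.UTF_8));  // JVM -> OS
                if (forceToDisk) {
                    raf.getFD().sync();                              // OS -> storage device
                }
                return raf.length();
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }

        public static void main(String[] args) {
            File log = tempLog();
            append(log, "commit-1\n", false);             // survives a JVM crash only
            long length = append(log, "commit-2\n", true); // also forced to the device
            System.out.println("log length in bytes: " + length);
        }
    }
    ```

    In this sketch a NO_SYNC commit would correspond to calling neither write() nor sync() at commit time, leaving the record in JVM memory until some later write.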

    Charles Lamb
