2 Replies. Latest reply: Jul 20, 2012 3:59 PM by 916475

Claiming free space

MarceloF.Ochoa Oracle ACE
Hi:
I have four NoSQL stores holding similar information, with similar parameters and keys.
Only one of them reclaims its free space after a daily maintenance task that purges keys more than one month old.
The other three stores keep growing without compacting or recovering the free space.
Is there a command similar to the SQL "alter table shrink space" that can be used with NoSQL?
The stores that keep growing appear to have more than 50% of their disk space free, judging by the value returned by "du -sk /kvstore/dir/" compared with the sizes printed by this code:
// Assumes "store" is an already opened KVStore handle
long counter = 0;
Iterator<KeyValueVersion> i = store.storeIterator(Direction.UNORDERED, 1000);
while (i.hasNext()) {
    KeyValueVersion kv = i.next();
    Key k = kv.getKey();
    Value v = kv.getValue();
    // Print each key followed by the size of its value in bytes
    System.out.println(k.toString() + " " + v.getValue().length);
    counter++;
    // Do some work with the Value here
}
Note that the sum of v.getValue().length over all stored keys is less than 50% of the amount reported by du.
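The comparison described above could be sketched roughly as follows. This is a minimal illustration, not code from the original post: the class name SpaceUsage, both method names, and the sample numbers are hypothetical, and a plain Iterator<byte[]> stands in for the byte arrays that v.getValue() would return while iterating the store.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

final class SpaceUsage {

    /** Sums the lengths of all value payloads produced by the iterator. */
    static long totalValueBytes(Iterator<byte[]> values) {
        long total = 0;
        while (values.hasNext()) {
            total += values.next().length;
        }
        return total;
    }

    /**
     * Returns the share of the on-disk footprint taken by value payloads,
     * as a percentage. duKilobytes is the number printed by "du -sk".
     */
    static double payloadPercent(long valueBytes, long duKilobytes) {
        if (duKilobytes <= 0) {
            throw new IllegalArgumentException("du size must be positive");
        }
        return 100.0 * valueBytes / (duKilobytes * 1024.0);
    }

    public static void main(String[] args) {
        // Stand-in data: in the real check each array would come from
        // kv.getValue().getValue() inside the store iteration loop.
        List<byte[]> sample = Arrays.asList(new byte[21 * 1024], new byte[21 * 1024]);
        long valueBytes = totalValueBytes(sample.iterator());
        long duKilobytes = 100; // hypothetical "du -sk" result
        System.out.printf("%d value bytes, %.1f%% of disk usage%n",
                valueBytes, payloadPercent(valueBytes, duKilobytes));
    }
}
```

Keys and per-record overhead are not counted here, so the percentage understates how much of the disk holds live data.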
Best regards, Marcelo.
  • 1. Re: Claiming free space
    Linda Lee Journeyer
    Marcelo,

    I think you are saying you have four distinct NoSQL deployments -- not that you have four nodes in a single store?

    Disk space reclamation for NoSQL DB nodes is automatic, and there is no explicit user-level command to reclaim space. The underlying storage is record-based rather than page-based, so space reclamation (or cleaning, as we call it) should be responsive to deletions and updates to the store.

    It is not unusual for user data to be 50% or less of the disk space used, because additional space is taken by internal metadata, such as indices, per-record headers, etc., and by the append-only, log-structured storage that holds the data. But if the four stores have similar data, one would expect the same general disk space usage. Things to think about are:
    - how do the usage patterns vary? Does the store that is different have a very different rate or mix of write operations? Is there a very different read access pattern (though reads should not matter that much)?
    - do the different stores have different resources available? In particular, is the available memory different, and/or has the cache size been set explicitly and differently on the stores?
    - we have two known issues with disk space reclamation in this release, which we hope to fix in the next.
    (a) SR #21069 - disk space reclamation can stall in some cases if all write operations cease. In particular, this can show up if a lot of data is loaded into the store and all following operations are read-only
    (b) SR #21488 - disk space cleaning can be inefficient if the record key is large and the record value is tiny
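To make the append-only behaviour above more concrete, here is a toy model of a log-structured store. It is only a sketch under simplifying assumptions and not how NoSQL DB's cleaner is implemented: the class ToyLog, the file capacity, and the utilization threshold are all hypothetical. It shows that deletes only mark old records obsolete, and that disk usage drops only once a cleaner copies the surviving records forward and drops the mostly empty file.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class ToyLog {
    /** One append-only file: bytes ever written to it and bytes still live. */
    static final class LogFile {
        long totalBytes;
        long liveBytes;
        final Map<String, Integer> liveRecords = new HashMap<>();
    }

    private final long fileCapacity;
    private final List<LogFile> files = new ArrayList<>();
    // Maps each key to the file that holds its current (live) version.
    private final Map<String, LogFile> index = new HashMap<>();

    ToyLog(long fileCapacity) {
        this.fileCapacity = fileCapacity;
        files.add(new LogFile());
    }

    /** Appends a record at the tail; any older version becomes obsolete. */
    void put(String key, int size) {
        markObsolete(key);
        LogFile tail = files.get(files.size() - 1);
        if (tail.totalBytes > 0 && tail.totalBytes + size > fileCapacity) {
            tail = new LogFile();
            files.add(tail);
        }
        tail.totalBytes += size;
        tail.liveBytes += size;
        tail.liveRecords.put(key, size);
        index.put(key, tail);
    }

    /** Marks the record obsolete; no bytes are freed on disk yet.
     *  (A real log would also append a small deletion marker.) */
    void delete(String key) {
        markObsolete(key);
        index.remove(key);
    }

    private void markObsolete(String key) {
        LogFile file = index.get(key);
        if (file != null) {
            Integer size = file.liveRecords.remove(key);
            if (size != null) {
                file.liveBytes -= size;
            }
        }
    }

    long diskBytes() {
        long sum = 0;
        for (LogFile f : files) sum += f.totalBytes;
        return sum;
    }

    long liveBytes() {
        long sum = 0;
        for (LogFile f : files) sum += f.liveBytes;
        return sum;
    }

    /** Drops every non-tail file whose live fraction is below the threshold,
     *  after re-appending its live records. Returns the number of files cleaned. */
    int clean(double minUtilization) {
        LogFile tail = files.get(files.size() - 1);
        List<LogFile> victims = new ArrayList<>();
        for (LogFile f : files) {
            if (f != tail && f.liveBytes < minUtilization * f.totalBytes) {
                victims.add(f);
            }
        }
        for (LogFile victim : victims) {
            files.remove(victim);
            // Iterate over a copy because put() updates the victim's record map.
            for (Map.Entry<String, Integer> e : new HashMap<>(victim.liveRecords).entrySet()) {
                put(e.getKey(), e.getValue());
            }
        }
        return victims.size();
    }

    public static void main(String[] args) {
        ToyLog log = new ToyLog(100);
        for (int i = 0; i < 10; i++) log.put("k" + i, 20);
        for (int i = 0; i < 4; i++) log.delete("k" + i);
        System.out.println("before cleaning: disk=" + log.diskBytes() + " live=" + log.liveBytes());
        log.clean(0.5);
        System.out.println("after cleaning:  disk=" + log.diskBytes() + " live=" + log.liveBytes());
    }
}
```

In this model the gap between disk usage and live data persists until cleaning runs, which is one reason a "du" figure can be well above the sum of the stored values.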

    There are per-node statistics that are off by default. Enabling them and comparing a node on the well-behaved system with a node on the problem system can provide more information. http://www.oracle.com/technetwork/database/nosqldb/learnmore/nosqldb-faq-518364.html#ReplicationNodeparameters describes the collectEnvStats and statsInterval parameters; you would want to set "collectEnvStats=true" and "statsInterval=5 MINUTES".
  • 2. Re: Claiming free space
    916475 Newbie
    Hi Linda:
    Thanks for your quick reply.
    Regarding the four deployments, you are right: four different stores run on 8 physical servers (two per store).
    The usage pattern is:
    - Tons of transactions (B2B XML messages) are recorded during the day
    - A few record reads per day (only some checks, heartbeats, and manual checking of failed transactions)
    - A daily job that deletes many of the one-month-old records, keeping 2% of the stored records
    - A daily job that uploads (via NoSQL reads) that 2% of the records to the RDBMS
    All stores have identical hardware (CPU, disk, memory).
    The only difference is that the store which works as expected receives fewer messages per day, around 50% of the traffic of the others.
    SR #21069 is not applicable: all the stores receive writes continually, every day.
    SR #21488 is not applicable: the key length is 80 characters and the record value is 21 Kbytes.
    I'll configure the stores to collect stats and see whether any of them gives me a hint about what is happening.
    Best regards, Marcelo.
