1 Reply Latest reply: Feb 7, 2013 12:41 PM by greybird

Read optimization time-series data

989556 Newbie
I am using Berkeley DB JE to store fairly high-frequency (10 Hz) time-series data collected from ~80 sensors. The idea is to import a large number of CSV files with this data and allow quick access to time ranges of data to plot with a web front end. I have created a "Sample" entity to hold these sampled metrics, keyed by the timestamp. My entity looks like this:

import java.util.LinkedHashMap;
import java.util.Map;

import com.sleepycat.persist.model.Entity;
import com.sleepycat.persist.model.PrimaryKey;

@Entity
public class Sample {

    // Unix time; seconds since the Unix epoch
    @PrimaryKey
    private double time;

    private Map<String, Double> metricMap = new LinkedHashMap<String, Double>();

    // getters/setters omitted
}

As you can see, there is quite a lot of data in each entity (~70-80 doubles), and I'm not sure that storing it this way is best. That is my first question.

I am accessing the database from a web front end. I am not too worried about insertion performance, since inserts don't happen often and are generally done all at once in bulk. For smaller ranges (~1-2 hours worth of samples) the read performance is decent enough for web calls. For larger ranges, the reads take quite a while. What would be the best approach for configuring this application?

Also, I want to be able to control the granularity of the samples: if the number of samples returned by a query is very large, I want to return only a fraction of them. Is there an easy way to count the number of entities a cursor will iterate over without actually iterating over them?

Here are my current configuration parameters:

environmentConfig.setAllowCreateVoid(true);
environmentConfig.setTransactionalVoid(true);
environmentConfig.setTxnNoSyncVoid(true);
environmentConfig.setCacheModeVoid(CacheMode.EVICT_LN);
environmentConfig.setCacheSizeVoid(1000000000);

databaseConfig.setAllowCreateVoid(true);
databaseConfig.setTransactionalVoid(true);
databaseConfig.setCacheModeVoid(CacheMode.EVICT_LN);
  • 1. Re: Read optimization time-series data
    greybird Expert
    Hi Ben, sorry for the slow response.
    As you can see, there is quite a lot of data in each entity (~70-80 doubles), and I'm not sure that storing it this way is best. That is my first question.
    That doesn't sound like a large record, so I don't see a problem. If the map keys are repeated in each record, that's wasted space that you might want to store differently.
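    For illustration only, one way to avoid repeating the metric names is to store them once in a catalog record and keep just a double[] in each sample, aligned positionally with the catalog. The class and field names below are invented for the sketch, not code from this thread:

    import java.util.ArrayList;
    import java.util.List;

    import com.sleepycat.persist.model.Entity;
    import com.sleepycat.persist.model.PrimaryKey;

    // Stored once per environment: the metric names, in a fixed order.
    @Entity
    class MetricCatalog {
        @PrimaryKey
        private String id = "catalog";
        private List<String> metricNames = new ArrayList<String>();
        // getters/setters omitted
    }

    // Each sample holds only the values; values[i] is the metric
    // named metricNames.get(i) in the catalog.
    @Entity
    class Sample {
        @PrimaryKey
        private double time;        // seconds since the Unix epoch
        private double[] values;
        // getters/setters omitted
    }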
    For larger ranges, the reads take quite a while. What would be the best approach for configuring this application?
    What isolation level do you require? Do you need the keys and the data? If the amount you're reading is a significant portion of the index, have you looked at using DiskOrderedCursor?
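    In case it helps, here is a rough sketch of a disk-ordered scan with the base API; the helper method is made up, and with the DPL you can get the underlying Database handle from PrimaryIndex.getDatabase(). Note that a DiskOrderedCursor reads the whole database in log (disk) order rather than key order, so it only pays off when you really are reading most of the records:

    import com.sleepycat.je.Database;
    import com.sleepycat.je.DatabaseEntry;
    import com.sleepycat.je.DiskOrderedCursor;
    import com.sleepycat.je.DiskOrderedCursorConfig;
    import com.sleepycat.je.OperationStatus;

    void scanAllSamples(Database db) {
        DiskOrderedCursorConfig config = new DiskOrderedCursorConfig();
        DiskOrderedCursor cursor = db.openCursor(config);
        try {
            DatabaseEntry key = new DatabaseEntry();
            DatabaseEntry data = new DatabaseEntry();
            // Locking is not meaningful for a disk-ordered scan, so pass null.
            while (cursor.getNext(key, data, null) == OperationStatus.SUCCESS) {
                // Deserialize key/data here, e.g. with the entity bindings.
            }
        } finally {
            cursor.close();
        }
    }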
    Also, I want to be able to control the granularity of the samples: if the number of samples returned by a query is very large, I want to return only a fraction of them. Is there an easy way to count the number of entities a cursor will iterate over without actually iterating over them?
    Not currently. Using the DPL, reading with a key-only cursor is the best available option. If you want to drop down to the base API, you can use Cursor.skipNext and skipPrev, which are further optimized.
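    As a sketch of the key-only approach with the DPL (the method name and the sampleByTime index variable are assumptions, e.g. a PrimaryIndex<Double, Sample>):

    import com.sleepycat.persist.EntityCursor;
    import com.sleepycat.persist.PrimaryIndex;

    long countRange(PrimaryIndex<Double, Sample> sampleByTime,
                    double start, double end) {
        long count = 0;
        // keys() fetches only the keys, not the record data.
        EntityCursor<Double> keys = sampleByTime.keys(start, true, end, true);
        try {
            while (keys.next() != null) {
                count++;
            }
        } finally {
            keys.close();
        }
        return count;
    }

    With the base API, Cursor.skipNext(maxCount, key, data, lockMode) can jump over many records at a time, which is also handy if you decide to return only every Nth sample.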
    environmentConfig.setAllowCreateVoid(true);
    Please use the method names without the Void suffix -- those are just for bean editors.
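    For reference, the same settings with the non-Void method names look like this:

    environmentConfig.setAllowCreate(true);
    environmentConfig.setTransactional(true);
    environmentConfig.setTxnNoSync(true);
    environmentConfig.setCacheMode(CacheMode.EVICT_LN);
    environmentConfig.setCacheSize(1000000000);

    databaseConfig.setAllowCreate(true);
    databaseConfig.setTransactional(true);
    databaseConfig.setCacheMode(CacheMode.EVICT_LN);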

    --mark
