4 Replies Latest reply: Jul 23, 2013 6:56 AM by 21402187-b8e1-49da-a7f4-c3a3242033a5

    What is the best big data solution for interactive queries of rows with up?

    1003216

      We have a simple table like the following:

      ------------------------------------------------------------------------
      | Name | Attribute1 | Attribute2 | Attribute3 | ... | Attribute200 |
      ------------------------------------------------------------------------
      | Name1 | Value1 | Value2 | null | ... | Value3 |
      | Name2 | null | Value4 | null | ... | Value5 |
      | Name3 | Value6 | null | Value7 | ... | null |
      | ... |
      ------------------------------------------------------------------------

      The table could hold up to hundreds of millions of rows (one per name). The data will be populated every hour or so.

      The goal is for interactive queries on the data to return results within a couple of seconds.

      Most queries look like:

      select count(*) from table
      where Attribute1 = Value1 and Attribute3 = Value3 and Attribute113 = Value113;

      The WHERE clause contains an arbitrary number of attribute name-value pairs.
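To make the query shape concrete, here is a minimal sketch of how such a query might be assembled from an arbitrary set of name-value pairs. It uses Python's built-in sqlite3 module purely for illustration; the table name `t`, the helper `build_count_query`, and the tiny three-attribute sample are assumptions for this example, not our real schema or data store.

```python
import sqlite3


def build_count_query(table, filters):
    """Return (sql, params) counting rows that match every attribute=value pair.

    `filters` is a dict such as {"Attribute1": "Value1", "Attribute3": "Value3"}.
    """
    # Column and table names cannot be bound as parameters, so validate them
    # before placing them in the SQL text; values are always bound with "?".
    for name in [table, *filters]:
        if not name.isidentifier():
            raise ValueError(f"unsafe identifier: {name!r}")
    sql = f'SELECT COUNT(*) FROM "{table}"'
    if filters:
        clauses = [f'"{col}" = ?' for col in filters]
        sql += " WHERE " + " AND ".join(clauses)
    return sql, list(filters.values())


def demo():
    """Run one query against a tiny in-memory copy of the sample table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE t (Name TEXT, Attribute1 TEXT, Attribute2 TEXT, Attribute3 TEXT)"
    )
    conn.executemany(
        "INSERT INTO t VALUES (?, ?, ?, ?)",
        [
            ("Name1", "Value1", "Value2", None),
            ("Name2", None, "Value4", None),
            ("Name3", "Value6", None, "Value7"),
        ],
    )
    sql, params = build_count_query(
        "t", {"Attribute1": "Value6", "Attribute3": "Value7"}
    )
    count = conn.execute(sql, params).fetchone()[0]
    conn.close()
    return count
```

Calling `demo()` counts the rows where both Attribute1 = Value6 and Attribute3 = Value7. The same builder would apply with any subset of the 200 attributes; what I don't know is which store and engine can answer it fast enough at our scale.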

      I'm new to big data and am wondering what the best option is in terms of data store (MySQL, HBase, Cassandra, etc.) and processing engine (Hadoop, Drill, Storm, etc.) for interactive queries like the one above.