4 Replies Latest reply: Jul 23, 2013 4:56 AM by 21402187-b8e1-49da-a7f4-c3a3242033a5

What is the best big data solution for interactive queries of rows with up?

1003216 Newbie

We have a simple table like the following:

------------------------------------------------------------------------
| Name | Attribute1 | Attribute2 | Attribute3 | ... | Attribute200 |
------------------------------------------------------------------------
| Name1 | Value1 | Value2 | null | ... | Value3 |
| Name2 | null | Value4 | null | ... | Value5 |
| Name3 | Value6 | null | Value7 | ... | null |
| ... |
------------------------------------------------------------------------

The table could hold up to hundreds of millions of rows (one per name). The data will be populated roughly every hour.

The goal is to get results for interactive queries on the data within a couple of seconds.

Most queries look like:

select count(*) from table
where Attribute1 = Value1 and Attribute3 = Value3 and Attribute113 = Value113;

The WHERE clause contains an arbitrary number of attribute name-value pairs.
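
To make the query shape concrete, here is a minimal Python sketch of how such a query could be built from an arbitrary set of attribute name-value pairs. It uses an in-memory SQLite table with three attributes purely as a small stand-in for the real data, not as a proposed solution. The helper name `build_count_query` and the table name `facts` are hypothetical and exist only for illustration.

```python
import sqlite3


def build_count_query(table, filters):
    """Build a parameterized COUNT(*) query from attribute name-value pairs."""
    # Column names cannot be bound as parameters, so they are quoted as
    # identifiers; the values are bound through "?" placeholders.
    def quote(identifier):
        return '"' + identifier.replace('"', '""') + '"'

    conditions = " AND ".join(f"{quote(name)} = ?" for name in filters)
    sql = f"SELECT COUNT(*) FROM {quote(table)}"
    if conditions:
        sql += f" WHERE {conditions}"
    return sql, list(filters.values())


# Tiny in-memory stand-in for the real table (3 attributes instead of 200).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE facts (Name TEXT, Attribute1 TEXT, Attribute2 TEXT, Attribute3 TEXT)"
)
conn.executemany(
    "INSERT INTO facts VALUES (?, ?, ?, ?)",
    [
        ("Name1", "Value1", "Value2", None),
        ("Name2", None, "Value4", None),
        ("Name3", "Value6", None, "Value7"),
    ],
)

# Any subset of attributes may appear in the filter.
sql, params = build_count_query(
    "facts", {"Attribute1": "Value6", "Attribute3": "Value7"}
)
count = conn.execute(sql, params).fetchone()[0]
print(sql)
print(count)
```

Because any combination of the 200 attributes can appear in the filter, no single index covers every query, which is the core difficulty at hundreds of millions of rows.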

I'm new to big data and am wondering what the best option is, in terms of data store (MySQL, HBase, Cassandra, etc.) and processing engine (Hadoop, Drill, Storm, etc.), for interactive queries like the one above.
