2 Replies. Latest reply: Oct 3, 2012 4:48 PM by Johan Louwers

Linux in Big Data projects

941430 Newbie
Hey guys, we are interested in learning from your experience using Linux in Big Data projects. Has anyone used Hadoop, MapR, or Hortonworks on Linux, and what was your experience with them? I am most interested in knowing whether a particular Linux distribution is better supported for Hadoop, and why. I would also like to know if anyone is using Gluster and, if so, whether there are any alternatives similar to Gluster.
  • 1. Re: Linux in Big Data projects
    Pinela Journeyer
    Hi,

    I've tried the Cloudera VM image, which comes pre-configured with everything needed (it ships with CentOS 5.8), and for simple tests it was able to run MapReduce jobs.
    https://ccp.cloudera.com/display/SUPPORT/Cloudera's+Hadoop+Demo+VM

    BR,
    Pinela.
  • 2. Re: Linux in Big Data projects
    Johan Louwers Explorer
    Experience tells me that it can be worthwhile to "build" your own Linux distro if you intend to deploy it on a massive scale. The reason is that you can start with a very bare, minimal installation and add only the functions and features you really need. If you use a standard distro, you will most likely get all kinds of functions and processes that you do not need and that all consume some of your resources.

    So, just some food for thought. :-)

    Regards,
    Johan Louwers.
