2 Replies Latest reply on Aug 6, 2013 1:05 PM by 056822ec-b841-417a-a9f0-3392790b8865

    Whats NoSqlDatabase ..With Hadoop


      Hi, I am not familiar with this. I am an Oracle RAC DBA on 11g Release 2, and I have seen one or two job requirements mentioning Big Data. What is Hadoop? What is NoSQL Database? Is NoSQL Database the 12c version for the cloud? Do I need to learn this new technology as an Oracle DBA? Can you guide me? Thanks a lot.

        • 1. Re: Whats NoSqlDatabase ..With Hadoop

          The Oracle NoSQL Database is a complement to the Oracle Database.  These days, new types of data and new workloads are being serviced by database technologies, so it's becoming more of a "right tool for the job" database landscape.


          When the types of data in the application change frequently (agile production apps) or are semi-structured (read: nested), and at the same time you have a lot of that data and need fairly simple, low-latency access to it (e.g. get by key, or get a bunch in a key range), that is the kind of workload that fits the Oracle NoSQL Database.  Because the data and its access patterns are pretty simple, the data is easy to replicate and spread across lots of machines, so the database can be scaled out to handle a very large number of users and a very large volume of data.
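

          To make the "get by key or get a bunch in a key range" access pattern concrete, here is a minimal in-memory sketch in Python. It is only an illustration: the class TinyKeyValueStore, its methods, and the "user/..." key layout are all made up for this example and are not the Oracle NoSQL Database API.

```python
import bisect


class TinyKeyValueStore:
    """In-memory stand-in for a key-value store (hypothetical, not a real product API)."""

    def __init__(self):
        self._keys = []    # keys kept sorted so range scans are cheap
        self._values = {}  # key -> value; values can be nested and schema-free

    def put(self, key, value):
        if key not in self._values:
            bisect.insort(self._keys, key)
        self._values[key] = value

    def get(self, key):
        """Simple lookup by exact key; returns None if the key is absent."""
        return self._values.get(key)

    def get_range(self, start, end):
        """Return (key, value) pairs with start <= key < end, in key order."""
        lo = bisect.bisect_left(self._keys, start)
        hi = bisect.bisect_left(self._keys, end)
        return [(k, self._values[k]) for k in self._keys[lo:hi]]


store = TinyKeyValueStore()
# Records under the same key prefix do not have to share a structure.
store.put("user/1001/profile", {"name": "Ann", "tags": ["dba", "rac"]})
store.put("user/1001/session/1", {"clicks": 3})
store.put("user/1001/session/2", {"clicks": 7, "device": {"os": "linux"}})
store.put("user/1002/profile", {"name": "Bob"})

profile = store.get("user/1001/profile")
# "0" sorts immediately after "/", so this range covers every session key of user 1001.
sessions = store.get_range("user/1001/session/", "user/1001/session0")
```

          Because every operation touches one key or one contiguous key range, a real system can assign key ranges to different machines, which is what makes this kind of store easy to scale out.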


          Hadoop, on the other hand, is largely for processing unstructured data that is extremely large in volume.  Think of web log records that capture each and every click on a web site.  All those log events get dumped into a special file system (HDFS), and then Hadoop's MapReduce framework can grab big chunks of that data (in parallel), scan through it all, and filter and transform it into something more suited for one of the other database technologies.  So, more often than not, Hadoop is used along with either a NoSQL or a relational database.  When Hadoop runs, it is a batch process that literally touches every piece of data in all the files targeted by the MapReduce operation.  So it is slow in some respects, though when you consider that it can do this on a petabyte of data, for this kind of workload it is, in relative terms, "fast".
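

          The map / group / reduce flow described above can be sketched in a few lines of plain Python. This is not Hadoop code: the log format (timestamp, user, page), the LOG_CHUNKS sample data, and the function names are assumptions made for illustration, and a thread pool stands in for the cluster that would process HDFS blocks in parallel.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

# Each inner list stands in for one chunk of a large log file (hypothetical data).
LOG_CHUNKS = [
    ["2013-08-06T13:00:01 u1 /home", "2013-08-06T13:00:02 u2 /products"],
    ["2013-08-06T13:00:03 u1 /products", "malformed line"],
    ["2013-08-06T13:00:04 u3 /home", "2013-08-06T13:00:05 u2 /home"],
]


def map_chunk(lines):
    """Map step: scan every line of a chunk and emit (page, 1) per valid click."""
    pairs = []
    for line in lines:
        fields = line.split()
        if len(fields) != 3:  # filter out records that do not parse
            continue
        pairs.append((fields[2], 1))
    return pairs


def shuffle(mapped_chunks):
    """Shuffle step: group all emitted values by key."""
    grouped = defaultdict(list)
    for pairs in mapped_chunks:
        for page, count in pairs:
            grouped[page].append(count)
    return grouped


def reduce_counts(grouped):
    """Reduce step: collapse each key's values into a single total."""
    return {page: sum(counts) for page, counts in grouped.items()}


# Chunks are independent, so the map step can run on all of them at once.
with ThreadPoolExecutor() as pool:
    mapped = list(pool.map(map_chunk, LOG_CHUNKS))

clicks_per_page = reduce_counts(shuffle(mapped))
```

          The small summary in clicks_per_page is the kind of result you would then load into a NoSQL or relational database for fast lookups, since the batch job itself has to read every record.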


          The Oracle Database is pretty well understood and is well suited for complex access to data in a well-defined schema using SQL, where JOINs are a fundamental aspect of the data access.  The various flavors let you run anything from basic small-server installs to higher-end multi-core servers and even high-end engineered systems when more data or speed is required, and then also things like RAC when high availability and scaling come into play.
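

          For contrast with the key-based access above, here is a small sketch of JOIN-style access over a fixed schema. It uses Python's built-in sqlite3 module purely as a stand-in so the example runs anywhere; it is not Oracle-specific, and the customers and orders tables are invented for illustration.

```python
import sqlite3

# In-memory SQLite database used only as a stand-in for a relational database.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customers (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL
    );
    CREATE TABLE orders (
        id          INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(id),
        amount      REAL NOT NULL
    );
    INSERT INTO customers VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO orders VALUES (10, 1, 25.0), (11, 1, 75.0), (12, 2, 40.0);
    """
)

# One declarative query relates two tables; a key-value store would leave
# this matching and aggregation to the application code.
rows = conn.execute(
    """
    SELECT c.name, COUNT(o.id), SUM(o.amount)
    FROM customers AS c
    JOIN orders AS o ON o.customer_id = c.id
    GROUP BY c.name
    ORDER BY c.name
    """
).fetchall()
conn.close()
```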


          More likely than not, all of these technologies will come into play in the data management space.  If you are a database guy, getting to know them is a good thing, so I would recommend reading some documentation and becoming familiar with the different terminology and architecture.  Then, when a project comes along, you will be able to follow the discussions that happen.


          Hope this helps,