Big Data and Hadoop

Today's world is a world of large data, ranging from petabytes to zettabytes. Data on this scale is called Big Data, and roughly 80% of the world's data is now in unstructured formats, created and held on the web. Over the next decade there will be an estimated 45 times more data than today. Many applications must process such large data sets within a fixed time limit. Hadoop, an open-source framework developed by Apache, addresses the problems that arise from massive volumes of data, including audio, video, text, and images.
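Hadoop's processing layer is built on the MapReduce model: a map step emits key/value pairs, a shuffle step groups them by key, and a reduce step aggregates each group. The sketch below is a minimal, single-process illustration of that pattern using only the Python standard library; the function names (`map_phase`, `shuffle`, `reduce_phase`) are hypothetical, and a real Hadoop job would instead subclass the framework's Mapper and Reducer classes and run distributed across a cluster.

```python
# Stdlib-only sketch of the map/shuffle/reduce pattern that Hadoop
# implements at scale; here everything runs in one process.
from collections import defaultdict

def map_phase(documents):
    # Map: emit a (word, 1) pair for every word in every document.
    for doc in documents:
        for word in doc.split():
            yield (word.lower(), 1)

def shuffle(pairs):
    # Shuffle: group all emitted values by key, as Hadoop does
    # between the map and reduce phases.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    # Reduce: aggregate the grouped values for each key.
    return {word: sum(counts) for word, counts in groups.items()}

docs = ["big data needs hadoop", "hadoop handles big data"]
counts = reduce_phase(shuffle(map_phase(docs)))
```

The same three-phase structure applies whether the input is two strings, as here, or petabytes of files split across thousands of machines; only the framework's distribution of the work changes.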
