Performance analysis of MapReduce programs on Hadoop cluster

This paper studies four MapReduce applications: pi, wordcount, grep, and TeraSort. We present experimental results for these applications on a Hadoop cluster and report their performance in terms of execution time as a function of the number of nodes. We find that execution time decreases as the number of nodes increases. The paper is intended as a research study of these MapReduce applications.
