Hadoop MapReduce Job Scheduler Implementation and Analysis in Heterogeneous Environment

Hadoop MapReduce is one of the most popular frameworks for big data analytics. A MapReduce cluster is typically shared among multiple users with heterogeneous workloads. When jobs are submitted to the cluster concurrently, resources are shared among them, so system performance may degrade. The challenge is to schedule tasks while giving every job a fair share of resources. Hadoop supports schedulers other than the default FIFO scheduler. We ran experiments on the Hadoop FIFO, Fair, and Capacity schedulers with heterogeneous workloads. Our aim is to compare these job schedulers under heterogeneous workloads. Because this comparison requires an understanding of the task scheduler parameters, we selected a few of those parameters for the performance analysis.
