Self-Learning MapReduce Scheduler in Multi-job Environment

Hadoop, the most widely adopted open-source implementation of the MapReduce framework, makes MapReduce widely accessible. However, it is currently limited by its default MapReduce scheduler. To achieve better performance, the scheduler should take into account nodes' computing power and system resources in a heterogeneous environment. Furthermore, from the job perspective, tasks' non-linear progress is also an important factor. Some research has been carried out to enhance the performance of MapReduce, but the resulting schedulers do not satisfactorily consider the characteristics of both nodes and jobs. To overcome this drawback, we propose a Self-Learning MapReduce Scheduler (SLM), which outperforms the existing schedulers in a multi-job environment. Since competition for system resources may make a task's progress unpredictable, SLM determines the progress of each job based on its own historical information. In particular, in the self-learning stage of a job, SLM calculates the task phase weights from the feedback information of the first few tasks. With these phase weights, SLM can obtain a more accurate execution time estimate, which is the most important condition for finding stragglers (slow tasks). Experimental results show that SLM can effectively improve the accuracy of execution time estimation and straggler identification, leading to rational utilization of resources and shorter job execution times, especially in a multi-job environment.
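The idea of learning phase weights from a job's first few tasks and using them to estimate execution time can be illustrated with a minimal sketch. The abstract gives no formulas, so everything below is an assumption made for illustration: the function names, the averaging rule for the weights, the progress-based time estimate, and the `slack` threshold for flagging stragglers are hypothetical and are not taken from SLM itself.

```python
from statistics import mean


def learn_phase_weights(finished_tasks):
    """Average each phase's share of total task time over finished tasks.

    finished_tasks: list of per-phase durations, one list per task,
    e.g. [copy_time, sort_time, reduce_time] (assumed phase layout).
    """
    n_phases = len(finished_tasks[0])
    fractions = [[d / sum(task) for d in task] for task in finished_tasks]
    return [mean(f[i] for f in fractions) for i in range(n_phases)]


def progress(weights, phase_index, phase_fraction):
    """Weighted progress: completed phases plus the share of the current one."""
    return sum(weights[:phase_index]) + weights[phase_index] * phase_fraction


def estimate_remaining(elapsed, weights, phase_index, phase_fraction):
    """Extrapolate remaining time from elapsed time and weighted progress."""
    p = progress(weights, phase_index, phase_fraction)
    if p <= 0:
        return float("inf")  # no progress yet, so no finite estimate
    return elapsed * (1 - p) / p


def find_stragglers(running, weights, slack=1.5):
    """Flag tasks whose estimated remaining time exceeds slack * average.

    running: dict of task id -> (elapsed, phase_index, phase_fraction).
    The slack rule is an assumed heuristic, not the paper's criterion.
    """
    remaining = {
        tid: estimate_remaining(elapsed, weights, idx, frac)
        for tid, (elapsed, idx, frac) in running.items()
    }
    avg = mean(remaining.values())
    return [tid for tid, r in remaining.items() if r > slack * avg]


# Feedback from the first two finished tasks of a job (made-up durations).
weights = learn_phase_weights([[10, 30, 60], [20, 20, 60]])
running = {"t1": (40, 2, 0.5), "t2": (40, 0, 0.5), "t3": (40, 2, 0.4)}
stragglers = find_stragglers(running, weights)
```

Because the weights come from the job's own finished tasks, a phase that dominates the run time contributes proportionally more to the progress score than it would under fixed, equal weights.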
