Notice of Violation of IEEE Publication Principles: Hadoop Preemptive Deadline Constraint Scheduler

MapReduce is a programming model for processing large amounts of data with parallel, distributed algorithms on a cluster of computing nodes. It provides a convenient programming interface for distributing data-intensive work in a cluster environment such as Hadoop. Preemption is an effective way for a MapReduce scheduler to avoid delaying high-priority jobs while still allowing the system to be shared with regular jobs. This paper addresses the problem of deadline-constrained scheduling in the MapReduce model. We present a new preemption approach that considers the remaining execution time of the currently running job when deciding whether to preempt it. Computer simulation demonstrates that the proposed scheme reduces both job execution time and queue waiting time compared to the existing scheme.
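The abstract does not spell out the preemption rule, so the following Python sketch is only one plausible reading of "considering the remaining execution time" and should not be taken as the paper's algorithm. It assumes a hypothetical `Job` record and a hypothetical `should_preempt` function: the arriving job preempts the running one only if waiting for the running job to finish would use up more than the arriving job's slack. The field names, the slack formula, and the choice not to preempt for a job that cannot meet its deadline anyway are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Job:
    """Hypothetical job record; field names are illustrative assumptions."""
    name: str
    remaining_time: float  # estimated seconds of work still to do
    deadline: float        # absolute time by which the job must finish


def should_preempt(running: Job, arriving: Job, now: float) -> bool:
    """Assumed rule: preempt only when waiting would cause a deadline miss.

    slack is how long the arriving job can wait and still finish on time.
    """
    slack = arriving.deadline - now - arriving.remaining_time
    if slack < 0:
        # Assumption: the arriving job misses its deadline even if started
        # immediately, so preempting the running job would not help.
        return False
    # Preempt only if the running job needs longer than the arriving job
    # can afford to wait.
    return running.remaining_time > slack


if __name__ == "__main__":
    urgent = Job("urgent", remaining_time=20.0, deadline=60.0)
    nearly_done = Job("batch-a", remaining_time=30.0, deadline=1000.0)
    long_running = Job("batch-b", remaining_time=50.0, deadline=1000.0)
    print(should_preempt(nearly_done, urgent, now=0.0))
    print(should_preempt(long_running, urgent, now=0.0))
```

Under this assumed rule, a running job that is close to completion is allowed to finish, which avoids discarding its progress, while a job with a long remaining time yields to the deadline-constrained arrival.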
