Cloud Computing—Task scheduling based on genetic algorithms

Cloud computing is a cutting-edge technology for managing and delivering services over the Internet. Map-Reduce is a programming model widely used in cloud computing for processing large data sets in parallel across large clusters. Making efficient use of such clusters requires good task scheduling. Genetic algorithms are well suited to finding good solutions to large-scale optimization problems such as task scheduling, and over the last few years they have gained considerable popularity as a robust and easily adaptable search technique. Hadoop, the open-source implementation of Map-Reduce, ships with several task schedulers (the FIFO, Fair, and Capacity schedulers), but none of them focuses on minimizing the global execution time. The goal of this project is to extend Hadoop with a scheduler based on a genetic algorithm that targets this objective.
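
To make the optimization target concrete, the following is a minimal sketch of a genetic algorithm that assigns independent tasks to nodes so as to reduce the global execution time (the makespan). It is an illustration under simplifying assumptions, not the scheduler built in this project and not Hadoop's scheduler API: task durations are assumed to be known in advance, nodes are assumed identical, and the names `makespan` and `evolve_schedule`, along with all parameter defaults, are hypothetical.

```python
import random


def makespan(assignment, durations, num_nodes):
    """Global execution time: the finish time of the most heavily loaded node."""
    loads = [0.0] * num_nodes
    for task, node in enumerate(assignment):
        loads[node] += durations[task]
    return max(loads)


def evolve_schedule(durations, num_nodes, pop_size=40, generations=200,
                    crossover_rate=0.9, mutation_rate=0.05, seed=0):
    """Search for a low-makespan assignment of tasks to nodes.

    A chromosome is a list in which position i holds the node for task i.
    All parameter values here are illustrative assumptions.
    """
    rng = random.Random(seed)
    n = len(durations)

    def cost(individual):
        return makespan(individual, durations, num_nodes)

    def tournament(population):
        # Binary tournament: draw two individuals and keep the better one.
        a, b = rng.sample(population, 2)
        return a if cost(a) <= cost(b) else b

    population = [[rng.randrange(num_nodes) for _ in range(n)]
                  for _ in range(pop_size)]
    best = min(population, key=cost)

    for _ in range(generations):
        next_population = [best[:]]  # elitism: the best schedule always survives
        while len(next_population) < pop_size:
            parent1, parent2 = tournament(population), tournament(population)
            if n > 1 and rng.random() < crossover_rate:
                cut = rng.randrange(1, n)  # single-point crossover
                child = parent1[:cut] + parent2[cut:]
            else:
                child = parent1[:]
            for i in range(n):
                if rng.random() < mutation_rate:
                    child[i] = rng.randrange(num_nodes)  # move task i to a random node
            next_population.append(child)
        population = next_population
        best = min(population, key=cost)

    return best, cost(best)


if __name__ == "__main__":
    task_durations = [4, 2, 7, 1, 3, 5, 6, 2]  # made-up durations for illustration
    schedule, total_time = evolve_schedule(task_durations, num_nodes=3)
    print("assignment:", schedule)
    print("makespan:", total_time)
```

Because the best individual is carried over unchanged into each new generation, the makespan of the best schedule never gets worse as the search proceeds. A real Hadoop scheduler would also have to estimate task durations and account for data locality and heterogeneous nodes, all of which this sketch leaves out.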
