RAS: A Task Scheduling Algorithm Based on Resource Attribute Selection in a Task Scheduling Framework

With the advent of the big data and cloud computing era, effectively scheduling and executing large-scale computing tasks and reasonably allocating resources to them has become a challenging problem. Research on efficient scheduling algorithms that improve resource utilization and task execution efficiency is therefore of theoretical significance. We present RAS, a scheduling algorithm based on resource attribute selection. Before a task is scheduled, RAS sends a set of test tasks to an execution node to determine its resource attributes; it then selects the optimal node to execute the task according to the task's resource requirements and the fitness between the resource node and the task, also using historical task data if it exists. We (1) give a formal definition of the resource attributes, (2) compute the fitness of the resource nodes, and (3) store the node selection information for the next round. We integrate our algorithm into the Gearman scheduling framework and, through comparison with three other scheduling frameworks, find a significant improvement in resource selection and resource utilization using RAS. The throughput of RAS with work-stealing (WS) is at least 30% higher than that of the other frameworks, and the resource utilization of RAS with WS reaches 0.94. The algorithm can serve as a good model for practical large-scale application scheduling.
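
The selection step described above can be illustrated with a minimal Python sketch. It is not the paper's formal definition: the abstract does not give the fitness formula, so the weighted-average fitness below is an assumption, as are the names `Node`, `Task`, `fitness`, and `select_node`, the idea that probed attributes are normalized to [0, 1], and the simple "reuse the last choice for this kind of task" use of history.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Node:
    name: str
    # Resource attributes as measured by test tasks, assumed normalized to [0, 1].
    attrs: dict[str, float]


@dataclass
class Task:
    kind: str
    # Relative weight of each resource the task needs (hypothetical representation).
    needs: dict[str, float]


def fitness(node: Node, task: Task) -> float:
    """Assumed fitness: weighted average of the node's attributes,
    weighted by how much the task needs each resource."""
    total = sum(task.needs.values())
    if total == 0:
        return 0.0
    weighted = sum(w * node.attrs.get(res, 0.0) for res, w in task.needs.items())
    return weighted / total


def select_node(nodes: list[Node], task: Task, history: dict[str, str]) -> str:
    """Pick a node for the task and record the choice for the next round.

    If history already holds a choice for this kind of task and that node
    is still available, reuse it; otherwise pick the node of highest fitness.
    """
    available = {n.name for n in nodes}
    previous = history.get(task.kind)
    if previous in available:
        return previous
    best = max(nodes, key=lambda n: fitness(n, task))
    history[task.kind] = best.name
    return best.name


if __name__ == "__main__":
    cluster = [
        Node("A", {"cpu": 0.9, "io": 0.2}),
        Node("B", {"cpu": 0.4, "io": 0.8}),
    ]
    cpu_heavy = Task("render", {"cpu": 0.8, "io": 0.2})
    seen: dict[str, str] = {}
    print(select_node(cluster, cpu_heavy, seen), seen)
```

In this sketch the history is a plain dictionary keyed by task kind; a real scheduler would likely store richer per-task statistics and re-probe nodes as their load changes.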
