Alea 2: job scheduling simulator

This work describes Alea 2, a Grid and cluster scheduling simulator designed for the study, testing, and evaluation of job scheduling techniques. This event-based simulator handles common job scheduling problems such as the heterogeneity of jobs and resources, as well as dynamic runtime changes such as the arrival of new jobs or resource failures and restarts. Alea 2 is based on the popular GridSim toolkit [31] and is a major extension of the Alea simulator, developed in 2007 [16]. The extension covers an improved design, extended functionality, improved scalability, and higher simulation speed. In addition, a new visualization interface was introduced into the simulator. The main part of the simulator is a complex scheduler that incorporates several common scheduling algorithms, each working on either the queue-based or the schedule-based (plan-based) principle. Additional data structures maintain information about resource status and the objective functions, and support the collection and visualization of simulation results. Many typical objectives, such as machine usage, average slowdown, and average response time, are included. The paper concludes with an example of an Alea 2 run on a real-life workload and discusses the scalability of the simulator.
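To make the objectives named in the abstract concrete, the following minimal Python sketch computes machine usage, average slowdown, and average response time for a set of finished jobs. It is not taken from Alea 2: the `Job` record, its field names, and the three functions are hypothetical, and the formulas are the commonly used textbook definitions (response time as completion minus arrival, slowdown as response time divided by runtime, usage as consumed CPU time over available CPU time), which may differ in detail from the definitions used in the simulator.

```python
from dataclasses import dataclass


@dataclass
class Job:
    """Hypothetical record of one finished job (not an Alea 2 class)."""
    arrival: float   # time the job entered the system
    start: float     # time the job began executing
    runtime: float   # execution time, assumed > 0
    cpus: int        # number of CPUs the job occupied

    @property
    def completion(self) -> float:
        return self.start + self.runtime


def avg_response_time(jobs):
    """Mean of (completion - arrival) over all jobs."""
    return sum(j.completion - j.arrival for j in jobs) / len(jobs)


def avg_slowdown(jobs):
    """Mean of response time divided by runtime over all jobs."""
    return sum((j.completion - j.arrival) / j.runtime for j in jobs) / len(jobs)


def machine_usage(jobs, total_cpus):
    """Consumed CPU time divided by the CPU time available
    between the first arrival and the last completion."""
    first = min(j.arrival for j in jobs)
    last = max(j.completion for j in jobs)
    used = sum(j.runtime * j.cpus for j in jobs)
    return used / (total_cpus * (last - first))


# Two 2-CPU jobs arriving together on a 2-CPU machine run back to back.
jobs = [Job(arrival=0, start=0, runtime=10, cpus=2),
        Job(arrival=0, start=10, runtime=10, cpus=2)]
print(avg_response_time(jobs), avg_slowdown(jobs), machine_usage(jobs, 2))
```

In this toy case the second job waits for the first, so its slowdown is 2 while the machine stays fully used; a scheduler is typically evaluated on how it trades such objectives against each other.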

[1]  John Levine,et al.  A fast, effective local search for scheduling independent jobs in heterogeneous computing environments , 2003 .

[2]  Jan Broeckhove,et al.  Scalability of Grid Simulators: An Evaluation , 2008, Euro-Par.

[3]  Fatos Xhafa,et al.  Meta-heuristics for Grid Scheduling Problems , 2008 .

[4]  Wilfried Jakob,et al.  Solving Scheduling Problems in Grid Resource Management Using an Evolutionary Algorithm , 2006, OTM Conferences.

[5]  Thomas Stützle,et al.  Stochastic Local Search: Foundations & Applications , 2004 .

[6]  Harold Enrique Castro Barrera,et al.  Desktop Grids and Volunteer Computing Systems , 2012 .

[7]  Jacques Chassin de Kergommeaux,et al.  Pajé, an interactive visualization tool for tuning multi-threaded parallel applications , 2000, Parallel Comput..

[8]  Dalibor Klusácek,et al.  Efficient Grid Scheduling Through the Incremental Schedule-Based Approach , 2011, Comput. Intell..

[9]  Henri Casanova,et al.  Scheduling distributed applications: the SimGrid simulation framework , 2003, 3rd IEEE/ACM International Symposium on Cluster Computing and the Grid (CCGrid 2003).

[10]  Iosif Legrand,et al.  The MONARC toolset for simulating large network-distributed processing systems , 2000, 2000 Winter Simulation Conference Proceedings.

[11]  Yves Caniou,et al.  Simbatch: An API for Simulating and Predicting the Performance of Parallel Resources Managed by Batch Systems , 2008, Euro-Par Workshops.

[12]  Rajkumar Buyya,et al.  Integrated Risk Analysis for a Commercial Computing Service in Utility Computing , 2007, 2007 IEEE International Parallel and Distributed Processing Symposium.

[13]  Dalibor Klusácek,et al.  Alea - Grid Scheduling Simulation Environment , 2007, PPAM.

[14]  Achim Streit,et al.  Scheduling in HPC Resource Management Systems: Queuing vs. Planning , 2003, JSSPP.

[15]  Fatos Xhafa,et al.  Use of genetic algorithms for scheduling jobs in large scale grid applications , 2006 .

[16]  Y. Mukaigawa,et al.  Large Deviations Estimates for Some Non-local Equations I. Fast Decaying Kernels and Explicit Bounds , 2022 .

[17]  Daniel C. Stanzione,et al.  Characterization of Bandwidth-Aware Meta-Schedulers for Co-Allocating Jobs Across Multiple Clusters , 2005, The Journal of Supercomputing.

[18]  Xin Liu,et al.  Validating and Scaling the MicroGrid: A Scientific Instrument for Grid Dynamics , 2004, Journal of Grid Computing.

[19]  Dalibor Klusácek,et al.  Comparison Of Multi-Criteria Scheduling Techniques , 2008, CoreGRID Integration Workshop.

[20]  Ming Q. Xu Effective metacomputing using LSF Multicluster , 2001, Proceedings First IEEE/ACM International Symposium on Cluster Computing and the Grid.

[21]  Xueyan Tang,et al.  Optimizing static job scheduling in a network of heterogeneous computers , 2000, Proceedings 2000 International Conference on Parallel Processing.

[22]  Rajkumar Buyya,et al.  A toolkit for modelling and simulating data Grids: an extension to GridSim , 2008, Concurr. Comput. Pract. Exp..

[23]  Ciprian Dobre,et al.  A Simulation Framework for Dependable Distributed Systems , 2008, 2008 International Conference on Parallel Processing - Workshops.

[24]  Satoshi Matsuoka,et al.  Overview of a performance evaluation system for global computing scheduling algorithms , 1999, Proceedings. The Eighth International Symposium on High Performance Distributed Computing.

[25]  David Abramson,et al.  Scheduling parameter sweep applications on global Grids: a deadline and budget constrained cost–time optimization algorithm , 2005, Softw. Pract. Exp..

[26]  P. Strevens Iii , 1985 .

[27]  F. Glover,et al.  Handbook of Metaheuristics , 2019, International Series in Operations Research & Management Science.

[29]  Hongbo Liu,et al.  Nature inspired meta-heuristics for grid scheduling: single and multi-objective optimization approaches , 2008 .

[30]  Ramin Yahyapour,et al.  Benefits of global grid computing for job scheduling , 2004, Fifth IEEE/ACM International Workshop on Grid Computing.

[31]  Chung Laung Liu,et al.  Scheduling Algorithms for Multiprogramming in a Hard-Real-Time Environment , 1989, JACM.

[32]  Jarek Nabrzyski,et al.  Grid scheduling simulations with GSSIM , 2007, 2007 International Conference on Parallel and Distributed Systems.

[33]  Hana Rudová,et al.  Complex Real-life Data Sets in Grid Simulations , 2009 .

[34]  Ross Mcnab,et al.  Simjava: A Discrete Event Simulation Library For Java , 1998 .

[35]  Jonathan Knudsen,et al.  Learning Java , 2000 .

[36]  Honbo Zhou,et al.  The EASY - LoadLeveler API Project , 1996, JSSPP.