A Comparative Analysis of Adaptive Solutions for Grid Environments

Grid computing environments are distributed systems composed of heterogeneous and geographically distributed resources. Such systems emerged mainly to satisfy the increasing demand for computing power within the scientific community. Despite the advantages of this paradigm, several challenges remain in the discovery, monitoring and selection of grid resources. Moreover, the dynamic nature and changing characteristics of these environments degrade application performance, so improving their efficiency is a fundamental issue. The present contribution analyses two self-adaptive solutions that enhance the grid resource selection process by using resources efficiently. The first is the Efficient Resources Selection model, which is defined from the user’s point of view (it avoids controlling or modifying the infrastructure) and is based on the Scatter Search method for achieving a suitable selection of resources. The second is Montera2, a framework designed for the efficient execution of distributed applications on the grid; it defines and employs a dynamic scheduling algorithm to determine the size and number of tasks to be executed. Both approaches have been tested on a real European infrastructure belonging to the well-known European Grid Infrastructure (EGI) project. The study also compares both solutions with the standard scheduling technique that governs this infrastructure, the gLiteWMS scheduler, and shows much better performance: the final makespan is reduced by a factor of 20 relative to gLiteWMS. An analysis of task and time overheads for both approaches is also included. Furthermore, comparisons with many other solutions proposed in the literature are presented, showing the advantages of our approaches.
