Multi-criteria and satisfaction oriented scheduling for hybrid distributed computing infrastructures

Assembling and simultaneously using different types of distributed computing infrastructures (DCIs), such as Grids and Clouds, is increasingly common. Because infrastructures are characterized by different attributes such as price, performance, trust, and greenness, the task scheduling problem becomes more complex and challenging. In this paper we present the design of a fault-tolerant and trust-aware scheduler that executes Bag-of-Tasks applications on elastic and hybrid DCIs, following user-defined scheduling strategies. Our approach, named the Promethee scheduler, combines a pull-based scheduler with the multi-criteria Promethee decision-making algorithm. Because multi-criteria scheduling multiplies the number of possible scheduling strategies, we propose SOFT, a methodology for finding the optimal scheduling strategies given a set of application requirements. The method is validated with a simulator that fully implements the Promethee scheduler and recreates a hybrid DCI environment, including Internet Desktop Grid, Cloud, and Best Effort Grid, based on real failure traces. A set of experiments shows that the Promethee scheduler is able to maximize user satisfaction expressed according to three distinct criteria (price, expected completion time, and trust) while maximizing the useful employment of the infrastructure from the resource owners' point of view. Finally, we present an optimization that bounds the computation time of the Promethee algorithm, making it realistic to integrate the scheduler into a wide range of resource management software.
We designed an overall multi-criteria task scheduling method for hybrid DCIs. The scheduling method allows a systematic integration of new scheduling criteria into it. We defined a methodology for finding optimal scheduling strategies. For the validation we consider both user and resource owners perspectives. We presented the experimental system built for the validation of the scheduling method.
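
To make the multi-criteria ranking idea concrete, the following is a minimal sketch of a generic PROMETHEE II net-flow computation over the three criteria named in the abstract. It is not the paper's implementation: the weights, the candidate values, the choice of the "usual" (strict) preference function, and all names (`CRITERIA`, `net_flows`, `rank`) are assumptions made for illustration only.

```python
from typing import Dict, List

# Hypothetical criteria: name -> (weight, True if larger values are better).
# The weights are illustrative assumptions and sum to 1.
CRITERIA = {
    "price": (0.4, False),                     # lower price is better
    "expected_completion_time": (0.4, False),  # lower time is better
    "trust": (0.2, True),                      # higher trust is better
}


def usual_preference(d: float) -> float:
    """'Usual' PROMETHEE preference function: strict preference if d > 0."""
    return 1.0 if d > 0 else 0.0


def net_flows(candidates: Dict[str, Dict[str, float]]) -> Dict[str, float]:
    """Return the PROMETHEE II net outranking flow of every candidate."""
    names = list(candidates)
    n = len(names)
    flows = {name: 0.0 for name in names}
    if n < 2:
        return flows
    for a in names:
        for b in names:
            if a == b:
                continue
            # Aggregated preference index pi(a, b): weighted sum over criteria.
            pi_ab = 0.0
            for crit, (weight, maximize) in CRITERIA.items():
                d = candidates[a][crit] - candidates[b][crit]
                if not maximize:
                    d = -d  # for minimized criteria, a smaller value wins
                pi_ab += weight * usual_preference(d)
            # pi(a, b) adds to a's positive flow and to b's negative flow.
            flows[a] += pi_ab / (n - 1)
            flows[b] -= pi_ab / (n - 1)
    return flows


def rank(candidates: Dict[str, Dict[str, float]]) -> List[str]:
    """Order candidates from best to worst by net flow."""
    flows = net_flows(candidates)
    return sorted(flows, key=flows.get, reverse=True)


# Made-up candidate evaluations, for illustration only.
EXAMPLE = {
    "A": {"price": 1.0, "expected_completion_time": 100.0, "trust": 0.9},
    "B": {"price": 2.0, "expected_completion_time": 50.0, "trust": 0.5},
    "C": {"price": 3.0, "expected_completion_time": 200.0, "trust": 0.4},
}
```

Calling `rank(EXAMPLE)` orders the candidates by net flow, so a pull-based scheduler could pick the first element each time a resource asks for work. Changing the weights in `CRITERIA` is one simple way to express a different user-defined scheduling strategy in this sketch.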
