Gestion dynamique des tâches dans les grappes, une approche à base de machines virtuelles. (Online Management of Jobs in Clusters using Virtual Machines)

Resource managers that rely on dynamic job management make efficient use of the resources of server clusters. To do so, they implement mechanisms that manipulate, on the fly, the state of jobs and their placement on the nodes of the cluster. In practice, these ad hoc scheduling strategies are difficult to adapt to clusters, because clusters do not necessarily allow jobs to be manipulated reliably and may impose specific scheduling constraints. In this thesis, we set out to ease the development of resource managers based on dynamic job management. To this end, we adopted an architecture based on virtual machines, which run users' jobs in their own software environment while providing the primitives needed to manipulate those jobs non-intrusively. We also proposed an autonomous approach that continuously optimizes job scheduling. Scheduling strategies are implemented using constraint programming, which makes it possible to define scheduling problems flexibly and to solve them. We validated our approach by developing and evaluating the Entropy prototype, which supports the implementation of different scheduling strategies. These strategies were able to address concrete, current problems effectively.
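
To make the idea of expressing placement as a constraint problem more concrete, here is a minimal sketch in Python. It is not the Entropy implementation and does not use a constraint solver: it is a plain backtracking search, and the data model (CPU and memory pairs), the function name `place`, and the objective (use as few nodes as possible) are all assumptions chosen for illustration.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical data model: (cpu, memory) demand of a VM or capacity of a node.
Resources = Tuple[int, int]


def place(vms: Dict[str, Resources],
          nodes: Dict[str, Resources]) -> Optional[Dict[str, str]]:
    """Return a VM -> node assignment that uses the fewest nodes,
    or None when no assignment satisfies the capacity constraints."""
    # Try the most demanding VMs first so dead ends show up early.
    vm_names: List[str] = sorted(vms, key=lambda v: vms[v], reverse=True)
    node_names: List[str] = list(nodes)
    free = {n: list(nodes[n]) for n in node_names}  # remaining capacity
    current: Dict[str, str] = {}
    best: Optional[Dict[str, str]] = None
    best_used = len(node_names) + 1

    def search(i: int) -> None:
        nonlocal best, best_used
        used = len(set(current.values()))
        if used >= best_used:
            return  # bound: this branch cannot beat the best solution
        if i == len(vm_names):
            best, best_used = dict(current), used
            return
        vm = vm_names[i]
        cpu, mem = vms[vm]
        for n in node_names:
            # Capacity constraint: the VM must fit in what is left on n.
            if free[n][0] >= cpu and free[n][1] >= mem:
                free[n][0] -= cpu
                free[n][1] -= mem
                current[vm] = n
                search(i + 1)
                del current[vm]
                free[n][0] += cpu
                free[n][1] += mem

    search(0)
    return best


if __name__ == "__main__":
    demo_vms = {"vm1": (2, 4), "vm2": (1, 2), "vm3": (1, 2)}
    demo_nodes = {"n1": (4, 8), "n2": (4, 8)}
    print(place(demo_vms, demo_nodes))
```

A dedicated constraint solver would replace the hand-written search with declarative constraints and propagation; the sketch only shows the shape of the problem: decision variables (one node per VM), capacity constraints, and an objective.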
