Self-organizing communication-aware resource management for scheduling in grid environment

Rapid advances in network and computer technologies are making networked computers, organized as a grid, an appealing vehicle for cost-effective parallel computing. However, handling communication efficiently during scheduling remains a main obstacle to using these resources. In this paper, we tackle this problem by partitioning resources into groups in a parallel and distributed fashion. Resources with good communication performance to each other are clustered into the same group. Our approach rests on the observation that, with high probability, communication latencies between adjacent resources are much lower than those between non-adjacent ones. Flooding with a small TTL (time-to-live) therefore inherently exploits the proximity between resources, which greatly improves the efficiency of the partitioning. Our distributed resource management method is well suited to environments with large-scale resources, such as grids.
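
The abstract does not spell out the partitioning procedure, so the following is only a minimal sketch of the general idea of TTL-limited flooding, not the paper's method. It assumes an overlay given as an adjacency map; the function name `ttl_flood`, the node labels, and the example topology are all hypothetical. A query forwarded from one resource reaches only the resources within `ttl` hops, which is how a small TTL restricts discovery to nearby candidates for the same group.

```python
from collections import deque


def ttl_flood(adjacency, source, ttl):
    """Return {node: hop_count} for every node reached from `source` within `ttl` hops.

    Illustrative only: models a flood as a breadth-first traversal in which
    a node whose hop count equals the TTL does not forward the message.
    """
    reached = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        hops = reached[node]
        if hops == ttl:
            continue  # TTL exhausted: this node does not forward further
        for neighbor in adjacency.get(node, ()):
            if neighbor not in reached:  # ignore duplicate deliveries
                reached[neighbor] = hops + 1
                queue.append(neighbor)
    return reached


# Hypothetical six-node overlay (assumption, not taken from the paper).
adjacency = {
    "a": ["b", "c"],
    "b": ["a", "d"],
    "c": ["a"],
    "d": ["b", "e"],
    "e": ["d", "f"],
    "f": ["e"],
}

nearby = ttl_flood(adjacency, "a", ttl=2)
```

With `ttl=2`, the flood started at `a` reaches `b`, `c`, and `d` but not `e` or `f`. A grouping step would then have to choose among these nearby resources using measured latency; that step is left out here because the abstract gives no details on it.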
