Simple locality-aware co-allocation in peer-to-peer supercomputing

Current grid middleware makes it difficult to deploy distributed supercomputing applications that run concurrently on multiple resources. Because these systems have problems with co-allocation (scheduling across multiple grid sites) and fault tolerance, and are difficult to set up and maintain, we consider an alternative: peer-to-peer (P2P) supercomputing. P2P supercomputing middleware overcomes many limitations of current grid systems. However, the lack of central components makes scheduling on P2P systems inherently difficult. As a possible scheduling solution for P2P supercomputing middleware, we introduce flood scheduling. It is locality-aware, decentralized, and flexible, and it supports co-allocation. We also introduce Zorilla, a prototype P2P supercomputing middleware system. An evaluation of Zorilla on over 600 processors at six sites of the Grid5000 system shows that flood scheduling, when used in a P2P network with suitable properties, is a good alternative to centralized algorithms.
