A decentralized and fault‐tolerant Desktop Grid system for distributed applications

This paper proposes a decentralized and fault‐tolerant software system for managing Desktop Grid resources. Its main design principle is to eliminate the centralized server, thereby removing the single point of failure and the bottleneck of existing Desktop Grids. Instead, each node can act alternately as client or server. Our main contribution is the PastryGrid protocol (based on Pastry), designed for Desktop Grids to support a wider class of applications, especially distributed applications with precedence constraints between tasks. We evaluate our approach against a centralized system on 205 machines executing 2500 tasks. The results show that our decentralized system outperforms XtremWeb‐CH, configured in master/slave mode, with respect to turnaround time. Copyright © 2009 John Wiley & Sons, Ltd.
