Characterizing Result Errors in Internet Desktop Grids

Desktop grids use the free resources in Intranet and Internet environments for large-scale computation and storage. While desktop grids offer a high return on investment, one critical issue is the validation of results returned by participating hosts. Several mechanisms for result validation have been proposed, but the characteristics of the errors themselves are poorly understood. To study error rates, we implemented and deployed a desktop grid application across several thousand hosts distributed over the Internet. We then analyzed the results to give a quantitative, empirical characterization of errors stemming from input or output (I/O) failures. We find that, in practice, errors are widespread across hosts but occur relatively infrequently. Moreover, we find that error rates tend to be neither stationary over time nor correlated between hosts. In light of these characterization results, we evaluate state-of-the-art error detection mechanisms and describe the trade-offs of using each mechanism.
