Reputation-Based Scheduling on Unreliable Distributed Infrastructures

This paper presents the design and analysis of scheduling techniques that cope with the inherent unreliability and instability of worker nodes in large-scale donation-based distributed infrastructures such as P2P and Grid systems. In particular, we focus on nodes that execute tasks on donated computational resources and may behave erratically or maliciously. We present a model in which reliability is not a binary property but a statistical one, based on a node's prior performance and behavior. We use this model to construct several reputation-based scheduling algorithms that use estimated reliability ratings of worker nodes to allocate tasks efficiently. Through simulation of a BOINC-like distributed computing infrastructure, we demonstrate that our algorithms can significantly improve throughput while maintaining a very high success rate of task completion.
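
The abstract does not spell out the reliability estimator or the scheduling algorithms, so the following is only a minimal illustrative sketch of the general idea, not the paper's method. It assumes (hypothetically) that a worker's reliability is estimated as a Laplace-smoothed fraction of its past results that were correct, that workers fail independently, and that a task succeeds when a strict majority of its assigned workers return a correct result. The names `estimated_reliability`, `majority_correct_probability`, and `pick_group`, and the parameters `target` and `max_size`, are all invented for illustration.

```python
from itertools import product


def estimated_reliability(correct, total):
    """Assumed estimator: Laplace-smoothed fraction of correct past results."""
    return (correct + 1) / (total + 2)


def majority_correct_probability(reliabilities):
    """Probability that a strict majority of independent workers is correct."""
    n = len(reliabilities)
    prob = 0.0
    # Enumerate every correct/incorrect outcome; fine for small groups.
    for outcome in product([True, False], repeat=n):
        if sum(outcome) * 2 > n:
            p = 1.0
            for ok, r in zip(outcome, reliabilities):
                p *= r if ok else 1.0 - r
            prob += p
    return prob


def pick_group(history, target=0.95, max_size=7):
    """Greedily add the most reliable workers until the target is reached.

    history maps a worker id to (correct_results, total_results).
    If the target is never reached, the largest group tried is returned.
    """
    ranked = sorted(
        history,
        key=lambda w: estimated_reliability(*history[w]),
        reverse=True,
    )
    group = []
    for worker in ranked[:max_size]:
        group.append(worker)
        rs = [estimated_reliability(*history[w]) for w in group]
        if majority_correct_probability(rs) >= target:
            break
    return group


if __name__ == "__main__":
    # Hypothetical histories: (correct results, total results) per worker.
    history = {"a": (98, 98), "b": (5, 10), "c": (0, 8)}
    print(pick_group(history, target=0.95))
```

Under these assumptions, a worker with a strong track record can be trusted alone, while less proven workers are grouped for redundancy. This is the kind of trade-off between throughput and task success rate that the abstract describes.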
