On Disconnection Node Failure and Stochastic Static Resilience of P2P Communication Networks

Many graph optimization problems in the literature arise in network design and analysis. Our objective in this paper is to highlight the disconnection probability that can arise in the interconnection networks of large-scale parallel processor systems. Although traditional measures of fault tolerance such as reliability and availability are applicable to such systems, these measures were designed mostly for mission-oriented applications or repairable systems. They fail to account for the high redundancy levels typical of peer-to-peer (P2P) communication and distributed systems. For these systems, new measures have been introduced that evaluate a system's capability for graceful degradation. In the design of such systems, one of the most fundamental considerations is the reliability of their interconnection networks, which can usually be characterized by the connectivity of the network's topological structure. In this paper, we analyze the problem of network disconnection in the context of large-scale P2P networks and examine how static patterns of node failure affect the resilience of such networks. Simulation results based on the network topology confirm the validity of the analytical approximation and demonstrate the localizer efficiency.
