Resource aggregation for fault tolerance in integrated services networks

For several real-time applications, it is critical that the failure of a network component does not lead to unexpected termination or long disruption of service. In this paper, we propose a scheme called RAFT (Resource Aggregation for Fault Tolerance) that guarantees recovery in a timely and resource-efficient manner. RAFT is presented in the framework of the Reliable Backbone (RBone), a virtual network layered on top of an integrated services network. Applications can request fault tolerance against RBone link and node failures. The basic idea of RAFT is to set up, for every fault-tolerant flow, a secondary path that serves as a backup in case the primary path fails. The secondary path resource reservations are aggregated whenever possible to reduce the overhead of providing fault tolerance. We show that the RSVP resource reservation protocol can support RAFT with simple extensions.
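The aggregation idea can be illustrated with a small sketch. It is not the RAFT algorithm itself, whose details the abstract does not give. It assumes a single-link-failure model in which two secondary reservations on the same link can share capacity when their primary paths have no link in common, since one failure cannot activate both backups. The function names, the flow tuples, and the link labels are all hypothetical.

```python
from collections import defaultdict


def dedicated_backup_reservation(flows):
    """Baseline without aggregation: every flow reserves its full
    bandwidth on every link of its secondary path."""
    total = defaultdict(float)
    for bandwidth, _primary, secondary in flows:
        for link in secondary:
            total[link] += bandwidth
    return dict(total)


def aggregated_backup_reservation(flows):
    """Assumed aggregation rule (single link failure at a time):
    a link only needs enough backup capacity for the worst single
    failure, not for the sum of all backups routed over it."""
    # demand[backup_link][failed_link] = bandwidth that would be
    # rerouted onto backup_link if failed_link went down.
    demand = defaultdict(lambda: defaultdict(float))
    for bandwidth, primary, secondary in flows:
        for failed in primary:
            for link in secondary:
                demand[link][failed] += bandwidth
    return {link: max(per_failure.values())
            for link, per_failure in demand.items()}


# Hypothetical flows: (bandwidth, primary path links, secondary path links).
example_flows = [
    (2.0, ["A-B"], ["A-C", "C-B"]),
    (3.0, ["D-B"], ["D-C", "C-B"]),
]

dedicated = dedicated_backup_reservation(example_flows)
aggregated = aggregated_backup_reservation(example_flows)
```

In this example the two primary paths are disjoint, so the shared link `C-B` needs only the larger of the two backup bandwidths under the assumed rule, rather than their sum.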
