Design, implementation, and evaluation of highly available distributed call processing systems

This paper presents the design of a highly available distributed call processing system and its implementation on a local area network of commercial, off-the-shelf workstations. A major challenge of using off-the-shelf components is meeting, in a cost-effective manner, the strict performance and availability requirements in place for existing public telecommunications systems. Traditional checkpointing and message logging schemes for general distributed applications are not directly applicable, since call processing applications built using these schemes suffer from high failure-free overhead and long recovery delays. We propose an application-level fault-tolerance scheme that takes advantage of general properties of distributed call processing systems to avoid message logging and to limit checkpointing overhead. Applied to a call processing system for wireless networks, the proposed scheme achieves average call setup latencies of 180 ms, failover times of less than three seconds, and recovery times of less than seventeen seconds. System availability is estimated to be 0.99995. These results indicate that the proposed scheme meets the performance and availability requirements of public telecommunications systems cost-effectively.
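
To put the availability estimate in more familiar terms, the short sketch below converts a steady-state availability figure into expected downtime per year. It is only an illustrative calculation: the function name `downtime_minutes_per_year` is hypothetical, a 365-day year is assumed, and the abstract does not say how the 0.99995 estimate itself was derived.

```python
# Illustrative arithmetic only; not the paper's availability model.
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years (assumption)


def downtime_minutes_per_year(availability: float) -> float:
    """Return expected unavailable minutes per year for a steady-state availability."""
    if not 0.0 <= availability <= 1.0:
        raise ValueError("availability must be in [0, 1]")
    return (1.0 - availability) * MINUTES_PER_YEAR


if __name__ == "__main__":
    # 0.99995 is the availability estimate quoted in the abstract.
    print(f"{downtime_minutes_per_year(0.99995):.1f} minutes per year")
```

Under these assumptions, an availability of 0.99995 corresponds to roughly 26 minutes of downtime per year.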
