Dependable distributed systems

Distributed software systems are the basis for many innovative applications. Dependability is the key to achieving scalable and maintainable distributed systems, because without it the complexity of distribution would leave the system uncontrollable. Our approach therefore aims at a concept for optimizing dependability. Like other approaches, we use replication as a means of providing transparent fault tolerance and persistence, but we focus in particular on increasing availability by relaxing data integrity through a mixture of asynchronous and synchronous replication techniques. This work makes three main contributions: first, a description of the envisioned trade-off between availability and consistency; second, a mechanism to achieve this trade-off; and third, models that use this mechanism and can be deployed transparently by developers. The goal is to enable a configurable, application-specific optimum of availability, possibly even controlled at runtime. A real-life telecommunication application serves as proof of concept.
