A provenance-aware weighted fault tolerance scheme for service-based applications

Service orientation has been proposed as a way to ease the development and integration of increasingly complex and heterogeneous system components. The paradigm also poses new challenges for the dependability community. In particular, individual channels within a fault-tolerant system may invoke common services as part of their workflows, which increases the potential for common-mode failure. We propose a scheme that, for the first time, links provenance with multi-version fault tolerance. We implement a large test system, run experiments on a single-version system, a traditional multi-version design (MVD) system, and a provenance-aware MVD system, and compare the results. In this experiment, the provenance-aware scheme yields a much more dependable system than either of the other two while imposing a negligible timing overhead.
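
The abstract does not say how provenance enters the vote, so the following is only a minimal sketch of one way the idea could work, not the authors' algorithm. It assumes each channel reports the set of services it invoked (its provenance record) and that a channel's vote is down-weighted by the number of other channels with which it shares at least one service. The names `provenance_weights` and `weighted_vote`, the weighting formula, and the example data are all hypothetical.

```python
from collections import defaultdict


def provenance_weights(provenance):
    """Assign each channel a weight from its provenance record.

    provenance maps a channel name to the set of services it invoked.
    Assumed rule: weight = 1 / (1 + number of other channels that
    share at least one service with this channel).
    """
    weights = {}
    for channel, services in provenance.items():
        overlapping = sum(
            1
            for other, other_services in provenance.items()
            if other != channel and services & other_services
        )
        weights[channel] = 1.0 / (1 + overlapping)
    return weights


def weighted_vote(results, weights):
    """Return the result value with the largest total channel weight."""
    tally = defaultdict(float)
    for channel, value in results.items():
        tally[value] += weights[channel]
    return max(tally, key=tally.get)


# Hypothetical scenario: channels A, B and C all call the same faulty
# service "s1" and agree on a wrong answer; D and E are independent.
provenance = {
    "A": {"s1", "s2"},
    "B": {"s1", "s3"},
    "C": {"s1", "s4"},
    "D": {"s5"},
    "E": {"s6"},
}
results = {"A": 41, "B": 41, "C": 41, "D": 42, "E": 42}

# Traditional voting: every channel counts equally.
equal_weights = {channel: 1.0 for channel in results}
plain_winner = weighted_vote(results, equal_weights)

# Provenance-aware voting: channels with shared services count for less.
aware_winner = weighted_vote(results, provenance_weights(provenance))
```

In this scenario the unweighted vote is carried by the three channels that share a service, while the provenance-aware vote favours the two independent channels. A real system would need a more careful weighting rule and a policy for ties.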
