Techniques for software implemented fault tolerance

Software-implemented node-level fault tolerance is an important technique for meeting dependability requirements in embedded safety-critical systems. This thesis addresses both the implementation of fault tolerance mechanisms and their validation. It demonstrates how reusable and highly configurable fault tolerance mechanisms can be built using aspect-oriented C++ and Java. The study shows that aspect-oriented programming (AOP) is well suited for implementing both systematic and application-specific mechanisms for node-level fault tolerance. Hence, a single framework for all node-level fault tolerance mechanisms can be obtained. It is also shown how systematic mechanisms can be restricted to the critical parts of the software, thereby reducing runtime overhead. Since the fault tolerance code becomes completely separated from the primary function code, AOP makes it possible to build easily applicable and reusable fault tolerance components. A framework of such components is presented and evaluated. This thesis also investigates the feasibility of emulating source-code software faults directly in Java bytecode for validation purposes. Experimental results show that software defects introduced in source code can be emulated in Java bytecode with a high level of confidence. This makes it possible to validate the dependability of Java programs with respect to realistic software defects embedded in commercial off-the-shelf (COTS) components, without access to their source code.
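
As an illustration of the separation between fault tolerance code and primary function code, the following is a minimal sketch and not the framework presented in the thesis. It assumes time redundancy (executing a call twice and comparing the results, with a third execution as tie-breaker) as an example of a systematic mechanism; the abstract does not name this mechanism. A standard Java dynamic proxy (`java.lang.reflect.Proxy`) stands in for an aspect weaver so that the example runs without extra tooling. The names `BrakeController`, `SimpleBrakeController`, `DoubleExecutionHandler`, and `protect` are hypothetical.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.util.Objects;
import java.util.Set;

public class TimeRedundancyDemo {

    /** Hypothetical primary function interface. */
    public interface BrakeController {
        int computeBrakeForce(int pedalPosition); // treated as critical
        String status();                          // treated as non-critical
    }

    /** Primary function code: contains no fault tolerance logic. */
    public static class SimpleBrakeController implements BrakeController {
        @Override
        public int computeBrakeForce(int pedalPosition) {
            return pedalPosition * 3;
        }

        @Override
        public String status() {
            return "ok";
        }
    }

    /**
     * Fault tolerance code, kept apart from the primary function.
     * Assumed mechanism: run each critical call twice and compare;
     * on a mismatch, run a third time and accept a matching result.
     */
    public static class DoubleExecutionHandler implements InvocationHandler {
        private final Object target;
        private final Set<String> criticalMethods;

        public DoubleExecutionHandler(Object target, Set<String> criticalMethods) {
            this.target = target;
            this.criticalMethods = criticalMethods;
        }

        @Override
        public Object invoke(Object proxy, Method method, Object[] args) throws Throwable {
            // Non-critical methods run once, which avoids the redundancy overhead.
            if (!criticalMethods.contains(method.getName())) {
                return call(method, args);
            }
            Object first = call(method, args);
            Object second = call(method, args);
            if (Objects.equals(first, second)) {
                return first;
            }
            Object third = call(method, args);
            if (Objects.equals(third, first) || Objects.equals(third, second)) {
                return third;
            }
            throw new IllegalStateException("No two executions agree for " + method.getName());
        }

        private Object call(Method method, Object[] args) throws Throwable {
            try {
                return method.invoke(target, args);
            } catch (InvocationTargetException e) {
                throw e.getCause(); // rethrow the exception raised by the target
            }
        }
    }

    /** Wraps a target so that only the named methods are executed redundantly. */
    public static <T> T protect(T target, Class<T> iface, Set<String> criticalMethods) {
        Object proxy = Proxy.newProxyInstance(
                iface.getClassLoader(),
                new Class<?>[] {iface},
                new DoubleExecutionHandler(target, criticalMethods));
        return iface.cast(proxy);
    }

    public static void main(String[] args) {
        BrakeController controller = protect(
                new SimpleBrakeController(),
                BrakeController.class,
                Set.of("computeBrakeForce"));
        System.out.println(controller.computeBrakeForce(10)); // prints 30
    }
}
```

In this sketch the set of method names plays the role that a pointcut would play in an aspect-oriented implementation: it selects which parts of the software the mechanism covers, while `SimpleBrakeController` stays unchanged.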