A Hierarchical Formal Framework for Adaptive N-variant Programs in Multi-core Systems

We propose a formal framework for designing and developing adaptive N-variant programs. The framework supports multiple levels of fault detection, masking, and recovery through reconfiguration. Our approach is two-fold. First, we introduce an Adaptive Functional Capability Model (AFCM) that defines levels of functional capability for each service the system provides. The AFCM specifies how, once a fault is detected, the system scales back its functional capabilities while still maintaining essential services. Second, we propose a Multilayered Assured Architecture Design (MAAD) to implement the reconfiguration requirements specified by AFCMs. The layered design improves system resilience in two dimensions. (1) Unlike traditional fault-tolerant architectures, which treat functional requirements uniformly, each layer of the assured architecture implements one level of functional capability defined in the AFCM. The architecture uses lower-layer functionalities, which are simpler and more reliable, as a reference to monitor higher-layer functionalities. The layered design also facilitates orderly system reconfiguration, resulting in graceful degradation, while maintaining essential system services. (2) Each layer of the assured architecture uses N-variant techniques to improve fault detection. The degree of redundancy introduced by the N-variant implementation determines the mix of faults that can be tolerated at each layer. Our hybrid fault model allows us to consider fault types ranging from benign faults to Byzantine faults. Finally, multiple layers combined with N-variant implementations are especially well suited to multi-core systems.
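The interplay between layers and variants can be illustrated with a minimal sketch. It is not the paper's formal model: the names `vote` and `run_layered`, the strict-majority voting rule, and the toy variants are all assumptions made for illustration. Each layer holds N variants of one capability level; a strict majority masks a minority of faulty variants, and when no majority exists the system degrades to the next lower (simpler) layer. Cross-layer monitoring, in which lower layers act as a reference for higher ones, is not modeled here.

```python
from collections import Counter

def vote(outputs):
    """Return the strict-majority output, or None if there is no majority."""
    value, count = Counter(outputs).most_common(1)[0]
    return value if count > len(outputs) // 2 else None

def run_layered(layers, x):
    """Run layers from highest to lowest capability.

    `layers[0]` is the essential service; higher indices are richer services.
    Returns (layer_index, agreed_result) for the first layer whose variants
    reach a strict majority.
    """
    for level in range(len(layers) - 1, -1, -1):
        outputs = []
        for variant in layers[level]:
            try:
                outputs.append(variant(x))
            except Exception:
                # A crash is treated as a benign, locally detectable fault.
                outputs.append(None)
        result = vote(outputs)
        if result is not None:
            return level, result
        # No agreement at this level: degrade to the next lower layer.
    raise RuntimeError("no layer reached agreement")

# Hypothetical variants: the full service squares its input,
# the essential service only returns the magnitude.
def sq_a(x): return x * x
def sq_b(x): return x ** 2
def sq_wrong(x): return x * x + 1          # value fault
def sq_crash(x): raise ValueError("boom")  # benign fault

essential = [abs, abs, abs]
masked = run_layered([essential, [sq_a, sq_b, sq_wrong]], 3)
degraded = run_layered([essential, [sq_a, sq_wrong, sq_crash]], 3)
print(masked, degraded)
```

In the first call one faulty variant is outvoted and the full service is retained; in the second the top layer cannot agree, so the essential service answers instead. Simple majority voting of this kind does not by itself handle Byzantine behavior, which requires the agreement protocols the hybrid fault model refers to.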
