A Light-weight Multilevel Recoverable Container for Event-driven System: A Self-healing CPS Approach

Cyber-Physical Systems (CPS) are regarded as a new technological revolution: they tightly integrate computing, communication, and control technologies to build smart, networked, distributed embedded control systems. A CPS is designed to interact autonomously with a volatile external environment, which implies that its requirements change constantly at run time, so guaranteeing system reliability becomes extremely difficult. Flexible self-healing mechanisms are urgently needed to improve the reliability and availability of CPS. This paper presents a light-weight container-based virtualization approach for event-driven CPS. By providing a unique run-time stack for each application, the container isolates faults and limits the effect of failures. Furthermore, a multilevel fault detection and recovery method is integrated to protect applications and to limit fault propagation. The analysis shows that the container has a very low memory footprint (939 bytes) and constant performance overhead. Testing also shows that the multilevel recovery is highly reliable in recovering from worst-case execution time (WCET) violation failures, even if the application is poorly designed or malicious.

Keywords: CPS; Container-based virtualization; Self-healing; Reliability; Availability; Fault Detection
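The abstract does not spell out the recovery levels, so the following is only a minimal sketch of how multilevel recovery from WCET violations could be organized in an event-driven container. Everything in it is an assumption made for illustration rather than the paper's design: the names `App` and `Container`, the three escalation levels (abort the event, restart the application, isolate the application), the thresholds `max_violations` and `max_restarts`, and the use of simulated ticks returned by the handler in place of a hardware timer and a per-application stack.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class App:
    """Hypothetical application record kept by the container."""
    name: str
    handler: Callable[[dict, str], int]  # returns simulated ticks consumed
    wcet_ticks: int                      # declared WCET budget per event
    state: dict = field(default_factory=dict)
    violations: int = 0                  # consecutive WCET violations
    restarts: int = 0
    disabled: bool = False


class Container:
    """Illustrative event dispatcher with escalating recovery (assumed levels)."""

    def __init__(self, max_violations: int = 2, max_restarts: int = 1):
        self.max_violations = max_violations
        self.max_restarts = max_restarts
        self.apps: Dict[str, App] = {}

    def register(self, app: App) -> None:
        self.apps[app.name] = app

    def dispatch(self, name: str, event: str) -> str:
        app = self.apps[name]
        if app.disabled:
            return "dropped"             # isolated apps receive no events
        used = app.handler(app.state, event)
        if used <= app.wcet_ticks:
            app.violations = 0           # a clean run resets the counter
            return "ok"
        return self._recover(app)

    def _recover(self, app: App) -> str:
        app.violations += 1
        if app.violations <= self.max_violations:
            return "event_aborted"       # level 1: discard the offending event
        if app.restarts < self.max_restarts:
            app.state.clear()            # level 2: restart with fresh state
            app.violations = 0
            app.restarts += 1
            return "app_restarted"
        app.disabled = True              # level 3: isolate the application
        return "app_isolated"


def good_handler(state: dict, event: str) -> int:
    state["count"] = state.get("count", 0) + 1
    return 3                             # stays within its budget


def bad_handler(state: dict, event: str) -> int:
    return 50                            # always overruns its budget


container = Container()
container.register(App("good", good_handler, wcet_ticks=5))
container.register(App("bad", bad_handler, wcet_ticks=5))

bad_results = [container.dispatch("bad", "tick") for _ in range(7)]
good_result = container.dispatch("good", "tick")
```

In this sketch the misbehaving application is escalated through the three assumed levels and finally isolated, while the well-behaved application keeps receiving events. A real container would have to preempt an overrunning handler instead of checking the cost after it returns; the after-the-fact check is used here only to keep the example deterministic.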
