Realizing a fault-tolerant embedded controller on distributed real-time systems

Advances in real-time, embedded, and distributed systems, along with control and communication theory, have catalyzed the rapid emergence of cyber-physical systems such as self-driving cars. Recent research has emphasized the importance of fault tolerance in a cyber-physical system (CPS), because a CPS senses its surroundings, processes sensor data, and reacts through its actuators. To tackle this challenge, we proposed SAFER (System-level Architecture for Failure Evasion in Real-time Applications) in our previous work. SAFER tolerates fail-stop processor and/or task failures in distributed embedded real-time systems. One of its limitations, however, is that it cannot tolerate the failure of a processor that has a dedicated connection to an actuator. This paper provides a method that relaxes this limitation by (1) deploying a small piece of hardware to avoid a dedicated connection between a processor and an actuator, (2) adding a software module that monitors and controls the hardware, and (3) enhancing the failure detection and recovery mechanisms of SAFER to support these changes. The detailed implementation and evaluation of the SAFER extension are ongoing work.
