Towards the Integration of Fault, Resource, and Power Management

The runtime management of faults, resources, and power has traditionally been dissociated in the development of embedded software systems. Each of the three copes with unsolicited events of a different type (errors, resource saturation, and power alarms, respectively) and operates independently of the other two. In this paper we study the use of alarm events generated by the resource and power management mechanisms to trigger a graceful degradation mechanism that is otherwise used for fault management purposes. The occurrence of an unsolicited event is reported to the graceful degradation mechanism, which removes from the running system those parts whose removal is necessary to eliminate, or lessen, the source of the event, while still allowing the system to deliver the basic functionality of each task that it runs. As a consequence, the reliability, robustness, and performance of the system under study improved significantly, for a negligible increase in the complexity of the graceful degradation mechanism.
