Effects of Soft Error to System Reliability

Soft errors in hardware can affect the reliability of a computer system, so estimating system reliability requires understanding how soft errors influence it. This paper explores the effects of soft errors on computer system reliability. We propose a new approach to measuring system reliability with respect to soft errors. In our approach, we first consider the reliability of the hardware components, and then consider system reliability, which expresses the system's ability to perform its required function. Based on the mechanism by which soft errors affect the system, we equate system reliability with software reliability. We build a software reliability model under soft-error conditions, in which the state of the software is analyzed in combination with the state of the hardware. For program errors that result from soft errors, we analyze error masking and distinguish the real errors, those that could lead to software failure. Finally, our experiments illustrate our analysis and validate our approach.
