Reliability of Non-Coherent Warm Standby Systems With Reworking

This paper models and analyzes non-repairable 1-out-of-N: G warm standby systems subject to periodic backups and dynamic reworking. In such systems, a standby element must redo some portion of the work already performed by the failed online element before taking over the mission task, which makes the actual mission time dynamic. These systems are widely used in applications such as computing and manufacturing, but have not been well studied in reliability theory. We propose a numerical algorithm to evaluate their reliability. The analysis reveals that these systems are non-coherent: the system reliability depends non-monotonically on the reliability of individual elements. Numerical examples further show that this non-coherence is more pronounced for elements initiated earlier in the warm standby list than for those initiated later. The examples also suggest that placing highly unreliable elements at the end of the warm standby waiting list, or even removing them from the system planning, can enhance the reliability of a warm standby system subject to reworking. These findings can guide the reliability design of such warm standby systems in practice.
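The reworking mechanism described above can be illustrated with a minimal sketch. It is not the numerical algorithm proposed in the paper; it only traces one deterministic scenario under simplifying assumptions: backups are taken every `backup_interval` units of completed work, each element has a given online time-to-failure, and standby-mode failures as well as backup and switching overheads are ignored. The names `last_backup`, `run_mission`, and all parameters are hypothetical.

```python
from typing import List, Tuple


def last_backup(progress: float, backup_interval: float) -> float:
    """Return the work level saved by the most recent backup."""
    return (progress // backup_interval) * backup_interval


def run_mission(work: float, backup_interval: float,
                lifetimes: List[float]) -> Tuple[bool, float]:
    """Trace a 1-out-of-N standby mission with reworking.

    lifetimes[i] is the (assumed) time element i survives once it goes
    online. Each element resumes from the last backup, so work done
    after that backup by the failed element must be redone.
    Returns (mission completed?, total elapsed time).
    """
    checkpoint = 0.0  # work level preserved by the latest backup
    elapsed = 0.0     # total time spent so far
    for life in lifetimes:
        remaining = work - checkpoint
        if life >= remaining:
            # This element survives long enough to finish the mission.
            return True, elapsed + remaining
        # Element fails; work since the last backup is lost.
        elapsed += life
        checkpoint = last_backup(checkpoint + life, backup_interval)
    return False, elapsed  # all elements failed before completion


# 10 units of work, backup every 4 units, three elements.
print(run_mission(10, 4, [6, 3, 7]))
```

In this scenario the first element fails at work level 6, so the second restarts from the backup at 4; it fails at level 7 without reaching the next backup, so the third also restarts from 4. The mission completes, but it takes 15 time units rather than the nominal 10, which shows how reworking makes the actual mission time depend on when failures occur.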
