System unavailability analysis based on window-observed recurrent event data

Many service industries require a high level of system availability to remain competitive. An appropriate system unavailability metric is important for making business decisions and for minimizing operational risk. In practice, a system can be unavailable for service because of multiple types of events, and the durations of these events can vary. In addition, the data that record the system operating history often have a complicated structure. In this paper, we develop a framework for estimating a system unavailability metric based on historical data from a fleet of heavy-duty industrial equipment, which we call System A. During the useful life of System A, repairs and maintenance actions are performed. However, not all repairs or maintenance actions are recorded. Specifically, information on event times, types, and durations is available only for certain time intervals, i.e., observation windows, rather than for the entire useful life of the system. Thus, the data are window-observed recurrent event data with multiple event types. We use a nonhomogeneous Poisson process model with a bathtub intensity function to describe the recurrent events, and a truncated lognormal distribution to describe the event durations. We then define a conservative metric for system unavailability, obtain an estimate of this metric, and quantify the statistical uncertainty of the estimate. Copyright © 2013 John Wiley & Sons, Ltd.
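To make the data structure concrete, the following is a minimal simulation sketch of window-observed recurrent events for a single unit. It is an illustration under stated assumptions, not the model fitted in the paper: the bathtub intensity is assumed to be a sum of a decaying and a growing exponential, all parameter values, window endpoints, and function names are hypothetical, and the downtime fraction computed at the end is a naive summary rather than the conservative metric defined in the paper. Event times are drawn by thinning a homogeneous Poisson process within each window, and durations are drawn from a lognormal distribution truncated from above by rejection.

```python
import math
import random


def bathtub_intensity(t, a=2.0, b=1.5, c=0.05, d=0.4):
    """Assumed bathtub-shaped intensity (events per year) at age t in years.

    The first term decays (early failures); the second grows (wear-out).
    Both the functional form and the parameters are hypothetical.
    """
    return a * math.exp(-b * t) + c * math.exp(d * t)


def simulate_window(rng, start, end, intensity=bathtub_intensity):
    """Draw NHPP event times in [start, end) by thinning."""
    # The assumed intensity is convex, so its maximum over the window
    # is attained at one of the two endpoints.
    lam_max = max(intensity(start), intensity(end))
    events = []
    t = start
    while True:
        t += rng.expovariate(lam_max)  # candidate from a rate-lam_max HPP
        if t >= end:
            return events
        if rng.random() * lam_max <= intensity(t):  # accept w.p. lambda(t)/lam_max
            events.append(t)


def truncated_lognormal(rng, mu, sigma, upper):
    """Draw a lognormal duration conditioned on being at most `upper`."""
    while True:
        x = rng.lognormvariate(mu, sigma)
        if x <= upper:
            return x


def simulate_unit(rng, windows, mu=-6.0, sigma=1.0, upper=0.05):
    """Simulate event times and durations (in years) inside the windows only."""
    times, durations = [], []
    for start, end in windows:
        for t in simulate_window(rng, start, end):
            times.append(t)
            durations.append(truncated_lognormal(rng, mu, sigma, upper))
    observed = sum(end - start for start, end in windows)
    # Naive summary: total recorded downtime over total observed time.
    return times, durations, sum(durations) / observed


rng = random.Random(2013)
windows = [(0.5, 2.0), (4.0, 7.0)]  # hypothetical observation windows (years)
times, durations, downtime_fraction = simulate_unit(rng, windows)
print(len(times), "events; naive downtime fraction =", downtime_fraction)
```

Because a Poisson process has independent increments, events in disjoint windows can be simulated separately, which is why the sketch never generates the unobserved history between windows.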
