Estimating Reliability Trends for the World's Fastest Computer

Los Alamos National Laboratory is home to the world's fastest computer, Blue Mountain. This machine was created by parallelizing "desktop" computers. To determine whether this type of architecture represents the future of supercomputing, its reliability must be estimated. This paper presents and analyzes failure data from Blue Mountain. Non-homogeneous Poisson processes are fit to the data within a Bayesian hierarchical framework. The task of selecting hyperparameters is discussed, and Bayes factors are used to compare models.
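As a rough illustration of what fitting a non-homogeneous Poisson process to failure times involves, the sketch below computes the classical maximum-likelihood estimates for a power-law intensity, lambda(t) = lam * beta * t**(beta - 1), from failure times observed on (0, t_end]. The power-law form is an assumption made here for illustration; the abstract does not say which intensity functions the paper uses, and this sketch does not implement the Bayesian hierarchical model or the Bayes factor comparison. The function names and the sample failure times are hypothetical.

```python
import math


def power_law_mle(failure_times, t_end):
    """Closed-form MLEs (lam, beta) for a power-law NHPP.

    Assumes time-truncated observation on (0, t_end] and the intensity
    lam * beta * t**(beta - 1). Illustrative only, not the paper's method.
    """
    n = len(failure_times)
    if n == 0:
        raise ValueError("at least one failure time is required")
    if any(t <= 0 or t > t_end for t in failure_times):
        raise ValueError("failure times must lie in (0, t_end]")
    log_sum = sum(math.log(t_end / t) for t in failure_times)
    if log_sum == 0:
        raise ValueError("all failures at t_end; beta is not estimable")
    beta = n / log_sum        # shape: < 1 improving, > 1 deteriorating
    lam = n / t_end ** beta   # scale: fitted expected count at t_end equals n
    return lam, beta


def intensity(t, lam, beta):
    """Failure intensity (failures per unit time) at time t > 0."""
    return lam * beta * t ** (beta - 1)


# Made-up failure times (arbitrary time units), observed up to t_end = 8.
lam_hat, beta_hat = power_law_mle([1.0, 2.0, 4.0], t_end=8.0)
```

Under this parameterization, an estimated beta below 1 indicates a decreasing failure intensity (reliability growth), while a value above 1 indicates deterioration.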
