Architecting Dependable Systems with Proactive Fault Management
[1] Flaviu Cristian,et al. Atomic Broadcast: From Simple Message Diffusion to Byzantine Agreement , 1995, Inf. Comput..
[2] Kishor S. Trivedi,et al. Analysis and implementation of software rejuvenation in cluster systems , 2001, SIGMETRICS '01.
[3] Isabelle Guyon,et al. An Introduction to Variable and Feature Selection , 2003, J. Mach. Learn. Res..
[4] Ricardo Vilalta,et al. A Perspective View and Survey of Meta-Learning , 2002, Artificial Intelligence Review.
[5] Tong Liu,et al. Availability prediction and modeling of high mobility OSCAR cluster , 2003, 2003 Proceedings IEEE International Conference on Cluster Computing.
[6] Miroslaw Malek,et al. Call Availability Prediction in a Telecommunication System: A Data Driven Empirical Approach , 2006, 2006 25th IEEE Symposium on Reliable Distributed Systems (SRDS'06).
[7] Miroslaw Malek. In Search of Real Data on Faults, Errors and Failures , 2006, 2006 Sixth European Dependable Computing Conference.
[8] George Candea,et al. Improving availability with recursive microreboots: a soft-state system case study , 2004, Perform. Evaluation.
[9] William Farr,et al. Software reliability modeling survey , 1996 .
[10] Miroslaw Malek,et al. On tolerating faults in naturally redundant algorithms , 1991, [1991] Proceedings Tenth Symposium on Reliable Distributed Systems.
[11] Joseph L. Hellerstein,et al. Predictive algorithms in the management of computer systems , 2002, IBM Syst. J..
[12] Ulf Westberg,et al. Maintenance scheduling under age replacement policy using proportional hazards model and TTT-plotting , 1997 .
[13] Michèle Basseville,et al. Detection of abrupt changes: theory and application , 1993 .
[14] Attila Csenki. Bayes predictive analysis of a fundamental software reliability model , 1990 .
[15] Daniel P. Siewiorek,et al. Reliable computer systems - design and evaluation (3. ed.) , 1992 .
[16] Kishor S. Trivedi,et al. Analysis of Preventive Maintenance in Transactions Based Software Systems , 1998, IEEE Trans. Computers.
[17] Ramendra K. Sahoo,et al. Evaluating cooperative checkpointing for supercomputing systems , 2006, Proceedings 20th IEEE International Parallel & Distributed Processing Symposium.
[18] Ting-Ting Yao Lin. Design and evaluation of an on-line predictive diagnostic system , 1988 .
[19] Miroslaw Malek,et al. A survey of online failure prediction methods , 2010, CSUR.
[20] Michael Tortorella,et al. Reliability Theory: With Applications to Preventive Maintenance , 2001, Technometrics.
[21] A. Avizienis,et al. Dependable computing: From concepts to design diversity , 1986, Proceedings of the IEEE.
[22] Peter A. Flach. The Geometry of ROC Space: Understanding Machine Learning Metrics through ROC Isometrics , 2003, ICML.
[23] Zhiling Lan,et al. Exploit failure prediction for adaptive fault-tolerance in cluster computing , 2006, Sixth IEEE International Symposium on Cluster Computing and the Grid (CCGRID'06).
[24] Laxmikant V. Kale,et al. Proactive Fault Tolerance in Large Systems , 2004 .
[25] Tadashi Dohi,et al. Analysis of software cost models with rejuvenation , 2000, Proceedings. Fifth IEEE International Symposium on High Assurance Systems Engineering (HASE 2000).
[26] Joseph L. Hellerstein,et al. Using Control Theory to Achieve Service Level Objectives In Performance Management , 2001, 2001 IEEE/IFIP International Symposium on Integrated Network Management Proceedings. Integrated Network Management VII. Integrated Management Strategies for the New Millennium.
[27] Kishor S. Trivedi,et al. Adaptive software rejuvenation: degradation model and rejuvenation scheme , 2003, 2003 International Conference on Dependable Systems and Networks, 2003. Proceedings..
[28] Günther A. Hoffmann,et al. Failure prediction in complex computer systems: a probabilistic approach , 2006 .
[29] Daniel P. Siewiorek,et al. Error log analysis: statistical modeling and heuristic trend analysis , 1990 .
[30] S. Scott,et al. A Failure Predictive and Policy-Based High Availability Strategy for Linux High Performance Computing Cluster , 2004 .
[31] Luís Moura Silva,et al. Deterministic Models of Software Aging and Optimal Rejuvenation Schedules , 2007, 2007 10th IFIP/IEEE International Symposium on Integrated Network Management.
[32] Rajeev Thakur,et al. A Meta-Learning Failure Predictor for Blue Gene/L Systems , 2007, 2007 International Conference on Parallel Processing (ICPP 2007).
[33] Kishor S. Trivedi,et al. A comprehensive model for software rejuvenation , 2005, IEEE Transactions on Dependable and Secure Computing.
[34] Bruno Cernuschi-Frías,et al. A nonparametric nonstationary procedure for failure prediction , 2002, IEEE Trans. Reliab..
[35] Martin D. Buhmann,et al. Radial Basis Functions: Theory and Implementations: Preface , 2003 .
[36] Kishor S. Trivedi,et al. Proactive management of software aging , 2001, IBM J. Res. Dev..
[37] A structured approach to the selection of condition based maintenance , 1997 .
[38] David Lorge Parnas,et al. Software aging , 1994, Proceedings of 16th International Conference on Software Engineering.
[39] L. Alvisi,et al. A Survey of Rollback-Recovery Protocols , 2002 .
[40] R. W. King,et al. Model-based nuclear power plant monitoring and fault detection: Theoretical foundations , 1997 .
[41] Santosh K. Shrivastava,et al. Reliable Computer Systems , 1985, Texts and Monographs in Computer Science.
[42] Cristina Nita-Rotaru,et al. A survey of attack and defense techniques for reputation systems , 2009, CSUR.
[43] Felix Salfner,et al. Event-based Failure Prediction: An Extended Hidden Markov Model Approach , 2008, Ausgezeichnete Informatikdissertationen.
[44] Daniel P. Siewiorek,et al. Reliable computer systems (2nd ed.): design and evaluation , 1992 .
[45] Kishor S. Trivedi,et al. A methodology for detection and estimation of software aging , 1998, Proceedings Ninth International Symposium on Software Reliability Engineering.
[46] Carl E. Landwehr,et al. Basic concepts and taxonomy of dependable and secure computing , 2004, IEEE Transactions on Dependable and Secure Computing.
[47] Ravishankar K. Iyer,et al. Recognition of Error Symptoms in Large Systems , 1986, FJCC.
[48] Huaglory Tianfield,et al. A concise introduction to autonomic computing , 2005, Adv. Eng. Informatics.
[49] Haw Ching Yang,et al. Application Cluster Service Scheme for Near-Zero-Downtime Services , 2005, Proceedings of the 2005 IEEE International Conference on Robotics and Automation.
[50] L. McLaughlin,et al. Optimal design of a condition-based maintenance model , 2004, Annual Symposium Reliability and Maintainability, 2004 - RAMS.
[51] Dorothy M. Andrews,et al. A Methodology for Analysis of Failure Prediction Data , 1985, RTSS.
[52] David H. Wolpert,et al. Stacked generalization , 1992, Neural Networks.
[53] P. J. Gardner. A transportation of ALGOL68C , 1977 .
[54] Petr Jan Horn,et al. Autonomic Computing: IBM's Perspective on the State of Information Technology , 2001 .
[55] Yennun Huang,et al. Software rejuvenation: analysis, module and applications , 1995, Twenty-Fifth International Symposium on Fault-Tolerant Computing. Digest of Papers.
[56] Brian Randell,et al. System structure for software fault tolerance , 1975, IEEE Transactions on Software Engineering.
[57] Kishor S. Trivedi,et al. A Best Practice Guide to Resource Forecasting for Computing Systems , 2007, IEEE Transactions on Reliability.
[58] P. M. Melliar-Smith,et al. Software reliability: The role of programmed exception handling , 1977, Language Design for Reliable Software.
[59] Miroslaw Malek,et al. The consensus problem in fault-tolerant computing , 1993, CSUR.
[60] David A. Patterson,et al. Embracing Failure: A Case for Recovery-Oriented Computing (ROC) , 2001 .
[61] Mira Kajko-Mattsson,et al. Can we learn anything from hardware preventive maintenance? , 2001, Proceedings Seventh IEEE International Conference on Engineering of Complex Computer Systems.
[62] George Candea,et al. Automatic failure-path inference: a generic introspection technique for Internet applications , 2003, Proceedings the Third IEEE Workshop on Internet Applications. WIAPP 2003.
[63] Brian Randell,et al. Reliability Issues in Computing System Design , 1978, CSUR.
[64] V. Kulkarni. Modeling and Analysis of Stochastic Systems , 1996 .
[65] J R Pinkert,et al. Reliable computer systems. , 1993, Journal of AHIMA.
[66] David Sinreich,et al. An architectural blueprint for autonomic computing , 2006 .
[67] Ram Chillarege,et al. Early warning of failures through alarm analysis a case study in telecom voice mail systems , 2003, 14th International Symposium on Software Reliability Engineering, 2003. ISSRE 2003..
[68] Daniel P. Siewiorek,et al. Reliable Computer Systems: Design and Evaluation, Third Edition , 1998 .
[69] Kishor S. Trivedi,et al. The fundamentals of software aging , 2008, 2008 IEEE International Conference on Software Reliability Engineering Workshops (ISSRE Wksp).
[70] Jean-Claude Laprie,et al. Software reliability and system reliability , 1996 .
[71] C. R. Cassady,et al. Characterization of optimal age-replacement policies , 1998, Annual Reliability and Maintainability Symposium. 1998 Proceedings. International Symposium on Product Quality and Integrity.
[72] Kishor S. Trivedi,et al. Fighting bugs: remove, retry, replicate, and rejuvenate , 2007, Computer.
[73] Tadashi Dohi,et al. Statistical non-parametric algorithms to estimate the optimal software rejuvenation schedule , 2000, Proceedings. 2000 Pacific Rim International Symposium on Dependable Computing.
[74] Martin D. Buhmann,et al. Radial Basis Functions , 2021, Encyclopedia of Mathematical Geosciences.