Impact of Failure Prediction on Availability: Modeling and Comparative Analysis of Predictive and Reactive Methods
[1] Xubin He,et al. Failure Prediction Models for Proactive Fault Tolerance within Storage Systems, 2008, 2008 IEEE International Symposium on Modeling, Analysis and Simulation of Computers and Telecommunication Systems.
[2] Bran Selic,et al. A Proactive Fault Tolerance Approach to High Performance Computing (HPC) in the Cloud, 2012, 2012 Second International Conference on Cloud and Green Computing.
[3] Christian Engelmann,et al. Proactive fault tolerance for HPC with Xen virtualization , 2007, ICS '07.
[4] Wenbing Zhao,et al. Proactive Service Migration for Long-Running Byzantine Fault Tolerant Systems, 2008, IET Softw.
[5] Kishor S. Trivedi,et al. SHARPE at the age of twenty two , 2009, PERV.
[6] Ming Mao,et al. A Performance Study on the VM Startup Time in the Cloud, 2012, 2012 IEEE Fifth International Conference on Cloud Computing.
[7] Miroslaw Malek,et al. Proactive fault handling for system availability enhancement , 2005, 19th IEEE International Parallel and Distributed Processing Symposium.
[8] Dong Seong Kim,et al. Sensitivity Analysis of Server Virtualized System Availability , 2012, IEEE Transactions on Reliability.
[9] Ikhwan Lee,et al. Survey of Error and Fault Detection Mechanisms, 2011.
[10] Juan Manuel García,et al. A survey of migration mechanisms of virtual machines , 2014, CSUR.
[11] Miroslaw Malek,et al. A survey of online failure prediction methods , 2010, CSUR.
[12] Felix Salfner,et al. Dependable Estimation of Downtime for Virtual Machine Live Migration, 2012.
[13] Jordi Torres,et al. Adaptive on-line software aging prediction based on machine learning , 2010, 2010 IEEE/IFIP International Conference on Dependable Systems & Networks (DSN).
[14] Nitin H. Vaidya,et al. Impact of Checkpoint Latency on Overhead Ratio of a Checkpointing Scheme , 1997, IEEE Trans. Computers.
[15] Ravishankar K. Iyer,et al. Measurement and modeling of computer reliability as affected by system activity , 1986, TOCS.
[16] Chokchai Leangsuksun,et al. Proficiency Metrics for Failure Prediction in High Performance Computing , 2010, International Symposium on Parallel and Distributed Processing with Applications.
[17] Carl E. Landwehr,et al. Basic concepts and taxonomy of dependable and secure computing , 2004, IEEE Transactions on Dependable and Secure Computing.
[18] Miroslaw Malek,et al. Call Availability Prediction in a Telecommunication System: A Data Driven Empirical Approach , 2006, 2006 25th IEEE Symposium on Reliable Distributed Systems (SRDS'06).
[19] Bianca Schroeder,et al. A Large-Scale Study of Failures in High-Performance Computing Systems , 2006, IEEE Transactions on Dependable and Secure Computing.
[20] Stephen L. Scott,et al. Evaluation of fault-tolerant policies using simulation , 2007, 2007 IEEE International Conference on Cluster Computing.
[21] Christian Engelmann,et al. A Framework for Proactive Fault Tolerance , 2008, 2008 Third International Conference on Availability, Reliability and Security.
[22] Miroslaw Malek,et al. Optimizing Failure Prediction to Maximize Availability , 2016, 2016 IEEE International Conference on Autonomic Computing (ICAC).
[23] Paulo Romero Martins Maciel,et al. Availability study on cloud computing environments: Live migration as a rejuvenation mechanism , 2013, 2013 43rd Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN).
[24] David M. Nicol,et al. Fluid stochastic Petri nets: Theory, applications, and solution techniques, 1998, Eur. J. Oper. Res.
[25] Yves Robert,et al. Checkpointing Strategies with Prediction Windows , 2013, 2013 IEEE 19th Pacific Rim International Symposium on Dependable Computing.
[26] Felix Salfner,et al. Timely Virtual Machine Migration for Pro-active Fault Tolerance , 2011, 2011 14th IEEE International Symposium on Object/Component/Service-Oriented Real-Time Distributed Computing Workshops.
[27] W. Kent Fuchs,et al. An adaptive checkpointing protocol to bound recovery time with message logging , 1999, Proceedings of the 18th IEEE Symposium on Reliable Distributed Systems.
[28] Franck Cappello,et al. Improving the Computing Efficiency of HPC Systems Using a Combination of Proactive and Preventive Checkpointing , 2013, 2013 IEEE 27th International Symposium on Parallel and Distributed Processing.
[29] Zhiling Lan,et al. Adaptive Fault Management of Parallel Applications for High-Performance Computing , 2008, IEEE Transactions on Computers.
[30] Jack J. Dongarra,et al. Exascale computing and big data , 2015, Commun. ACM.