Towards Proactive Fault Management of Enterprise Systems
[1] Zhiling Lan,et al. Adaptive Fault Management of Parallel Applications for High-Performance Computing, 2008, IEEE Transactions on Computers.
[2] Jorge-Arnulfo Quiané-Ruiz,et al. RAFTing MapReduce: Fast recovery on the RAFT , 2011, 2011 IEEE 27th International Conference on Data Engineering.
[3] Gabor Karsai,et al. Application of software health management techniques , 2011, SEAMS '11.
[4] Transparent Fault Tolerance of Device Drivers for Virtual Machines , 2010, IEEE Transactions on Computers.
[5] José A. B. Fortes,et al. Fault Management in Map-Reduce Through Early Detection of Anomalous Nodes , 2013, ICAC.
[6] Vincenzo Grassi,et al. The KlaperSuite framework for model-driven reliability analysis of component-based systems , 2014, Software & Systems Modeling.
[7] Christian Engelmann,et al. Proactive fault tolerance for HPC with Xen virtualization , 2007, ICS '07.
[8] Aurelien Bouteiller,et al. Fault Tolerance Management for a Hierarchical GridRPC Middleware , 2008, 2008 Eighth IEEE International Symposium on Cluster Computing and the Grid (CCGRID).
[9] Alessandra Gorla,et al. Automatic recovery from runtime failures , 2013, 2013 35th International Conference on Software Engineering (ICSE).
[10] Zhiling Lan,et al. Toward Automated Anomaly Identification in Large-Scale Systems , 2010, IEEE Transactions on Parallel and Distributed Systems.
[11] José A. B. Fortes,et al. Towards self-caring mapreduce: Proactively reducing fault-induced execution-time penalties , 2011, 2011 International Conference on High Performance Computing & Simulation.
[12] Xiaohui Gu,et al. UBL: unsupervised behavior learning for predicting performance anomalies in virtualized cloud systems , 2012, ICAC '12.
[13] Zhi-Li Zhang,et al. Co-designing the failure analysis and monitoring of large-scale systems , 2008, PERV.
[14] Pan Pan,et al. Dynamic Workflow Management and Monitoring Using DDS , 2010, 2010 Seventh IEEE International Conference and Workshops on Engineering of Autonomic and Autonomous Systems.
[15] Nir Friedman,et al. Bayesian Network Classifiers , 1997, Machine Learning.
[16] Dirk Beyer,et al. Designing for Disasters , 2004, FAST.
[17] Franck Cappello,et al. Optimization of Multi-level Checkpoint Model for Large Scale HPC Applications , 2014, 2014 IEEE 28th International Parallel and Distributed Processing Symposium.
[18] Qing Li,et al. FACTS: A Framework for Fault-Tolerant Composition of Transactional Web Services , 2010, IEEE Transactions on Services Computing.
[19] Michael D. Bond,et al. Tolerating memory leaks , 2008, OOPSLA.
[20] Ricardo J. Rodríguez,et al. Fault-tolerant techniques and security mechanisms for model-based performance prediction of critical systems , 2012, ISARCS '12.
[21] Ching-Hsien Hsu,et al. On improvement of cloud virtual machine availability with virtualization fault tolerance mechanism , 2011, 2011 IEEE Third International Conference on Cloud Computing Technology and Science.
[22] Indranil Gupta,et al. Making cloud intermediate data fault-tolerant , 2010, SoCC '10.
[23] Jeffrey O. Kephart,et al. The Vision of Autonomic Computing , 2003, Computer.
[24] Jack Y. B. Lee. Supporting server-level fault tolerance in concurrent-push-based parallel video servers , 2001, IEEE Trans. Circuits Syst. Video Technol.
[25] Andrew S. Tanenbaum,et al. Dealing with Driver Failures in the Storage Stack , 2009, 2009 Fourth Latin-American Symposium on Dependable Computing.
[26] Zizhong Chen,et al. Highly Scalable Self-Healing Algorithms for High Performance Scientific Computing , 2009, IEEE Transactions on Computers.
[27] Jeffrey Dean,et al. Designs, Lessons and Advice from Building Large Distributed Systems , 2009.
[28] Zhiling Lan,et al. 3-Dimensional root cause diagnosis via co-analysis , 2012, ICAC '12.
[29] Nagarajan Kandasamy,et al. On the application of predictive control techniques for adaptive performance management of computing systems , 2009, IEEE Transactions on Network and Service Management.
[30] Jing Deng,et al. Fault-tolerant and reliable computation in cloud computing , 2010, 2010 IEEE Globecom Workshops.