Towards Detecting Patterns in Failure Logs of Large-Scale Distributed Systems
[1] Jon Stearley,et al. What Supercomputers Say: A Study of Five System Logs , 2007, 37th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN'07).
[2] Miroslaw Malek,et al. Comprehensive logfiles for autonomic systems , 2004, 18th International Parallel and Distributed Processing Symposium, 2004. Proceedings.
[3] Mark Crovella,et al. Mining anomalies using traffic feature distributions , 2005, SIGCOMM '05.
[4] Michal Aharon,et al. One Graph Is Worth a Thousand Logs: Uncovering Hidden Structures in Massive System Event Logs , 2009, ECML/PKDD.
[5] P. N. Suganthan,et al. Differential Evolution: A Survey of the State-of-the-Art , 2011, IEEE Transactions on Evolutionary Computation.
[6] Edward Chuah,et al. Diagnosing the root-causes of failures from cluster log files , 2010, 2010 International Conference on High Performance Computing.
[7] Ravishankar K. Iyer,et al. Automatic Recognition of Intermittent Failures: An Experimental Study of Field Data , 1990, IEEE Trans. Computers.
[8] Saharon Rosset,et al. Analyzing system logs: a new view of what's important , 2007.
[9] Michael I. Jordan,et al. Detecting large-scale system problems by mining console logs , 2009, SOSP '09.
[10] Anand Sivasubramaniam,et al. Filtering failure logs for a BlueGene/L prototype , 2005, 2005 International Conference on Dependable Systems and Networks (DSN'05).
[11] Daniel P. Siewiorek,et al. Models for time coalescence in event logs , 1992, [1992] Digest of Papers. FTCS-22: The Twenty-Second International Symposium on Fault-Tolerant Computing.
[12] Edward Chuah,et al. Establishing Hypothesis for Recurrent System Failures from Cluster Log Files , 2011, 2011 IEEE Ninth International Conference on Dependable, Autonomic and Secure Computing.
[13] Jianfeng Zhan,et al. LogMaster: Mining Event Correlations in Logs of Large-Scale Cluster Systems , 2010, 2012 IEEE 31st Symposium on Reliable Distributed Systems.
[14] Tommy Minyard,et al. End-to-end framework for fault management for open source clusters: Ranger , 2010, TG.
[15] Ravishankar K. Iyer,et al. Recognition of Error Symptoms in Large Systems , 1986, FJCC.
[16] Franck Cappello,et al. Adaptive event prediction strategy with dynamic time window for large-scale HPC systems , 2011, SLAML '11.
[17] Carl E. Landwehr,et al. Basic concepts and taxonomy of dependable and secure computing , 2004, IEEE Transactions on Dependable and Secure Computing.
[18] Vladimir I. Levenshtein,et al. Binary codes capable of correcting deletions, insertions, and reversals , 1965.
[19] Jon Stearley,et al. Bad Words: Finding Faults in Spirit's Syslogs , 2008, 2008 Eighth IEEE International Symposium on Cluster Computing and the Grid (CCGRID).
[20] Alexander Aiken,et al. Alert Detection in System Logs , 2008, 2008 Eighth IEEE International Conference on Data Mining.
[21] Algirdas Avizienis,et al. Basic Concepts and Taxonomy of Dependable and Secure Computing , 2004.
[22] Ling Huang,et al. Mining Console Logs for Large-Scale System Problem Detection , 2008, SysML.
[23] Elizabeth R. Jessup,et al. Matrices, Vector Spaces, and Information Retrieval , 1999, SIAM Rev.
[24] Robert L. Mercer,et al. Class-Based n-gram Models of Natural Language , 1992, CL.
[25] Anand Sivasubramaniam,et al. BlueGene/L Failure Analysis and Prediction Models , 2006, International Conference on Dependable Systems and Networks (DSN'06).
[26] Zhiling Lan,et al. System log pre-processing to improve failure prediction , 2009, 2009 IEEE/IFIP International Conference on Dependable Systems & Networks.
[27] Glenn A. Fink,et al. Predicting Computer System Failures Using Support Vector Machines , 2008, WASL.
[28] Franck Cappello,et al. Taming of the Shrew: Modeling the Normal and Faulty Behaviour of Large-scale HPC Systems , 2012, 2012 IEEE 26th International Parallel and Distributed Processing Symposium.