Towards intelligent incident management: why we need it and how we make it
Yu Kang | Pu Zhao | Qingwei Lin | Bo Qiao | Zhuangbin Chen
[1] Yu Zhang, et al. Log Clustering Based Problem Identification for Online Service Systems, 2016, 2016 IEEE/ACM 38th International Conference on Software Engineering Companion (ICSE-C).
[2] Yoon Kim, et al. Convolutional Neural Networks for Sentence Classification, 2014, EMNLP.
[3] Haryadi S. Gunawi, et al. Why Does the Cloud Stop Computing?: Lessons from Hundreds of Service Outages, 2016, SoCC.
[4] Paramvir Bahl, et al. Discovering Dependencies for Network Management, 2006, HotNets.
[5] Peng Huang, et al. Gray Failure: The Achilles' Heel of Cloud-Scale Systems, 2017, HotOS.
[6] Xu Zhang, et al. Robust log-based anomaly detection on unstable log data, 2019, ESEC/SIGSOFT FSE.
[7] David A. Patterson, et al. Path-Based Failure and Evolution Management, 2004, NSDI.
[8] Yangfan Zhou, et al. iFeedback: Exploiting User Feedback for Real-Time Issue Detection in Large-Scale Online Service Systems, 2019, 2019 34th IEEE/ACM International Conference on Automated Software Engineering (ASE).
[9] Dongmei Zhang, et al. An Empirical Investigation of Incident Triage for Online Service Systems, 2019, 2019 IEEE/ACM 41st International Conference on Software Engineering: Software Engineering in Practice (ICSE-SEIP).
[10] Shilin He, et al. Characterizing the Natural Language Descriptions in Software Logging Statements, 2018, 2018 33rd IEEE/ACM International Conference on Automated Software Engineering (ASE).
[11] Harry Wechsler, et al. A Martingale Framework for Detecting Changes in Data Streams by Testing Exchangeability, 2010, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[12] Dongmei Zhang, et al. iDice: Problem Identification for Emerging Issues, 2016, 2016 IEEE/ACM 38th International Conference on Software Engineering (ICSE).
[13] Hang Dong, et al. Outage Prediction and Diagnosis for Cloud Service Systems, 2019, WWW.
[14] Chan-Gun Lee, et al. Applying deep learning based automatic bug triager to industrial projects, 2017, ESEC/SIGSOFT FSE.
[15] Hao Hu, et al. Effective Bug Triage Based on Historical Bug-Fix Information, 2014, 2014 IEEE 25th International Symposium on Software Reliability Engineering.
[16] Richard Mortier, et al. Using Magpie for Request Extraction and Workload Modelling, 2004, OSDI.
[17] Peng Huang, et al. AIOps: Real-World Challenges and Research Innovations, 2019, 2019 IEEE/ACM 41st International Conference on Software Engineering: Companion Proceedings (ICSE-Companion).
[18] Fabio Casati, et al. Toward Web Service Dependency Discovery for SOA Management, 2008, 2008 IEEE International Conference on Services Computing.
[19] Haoxiang Lin, et al. An Empirical Study on Quality Issues of Production Big Data Platform, 2015, 2015 IEEE/ACM 37th IEEE International Conference on Software Engineering.
[20] Qiang Fu, et al. Software analytics for incident management of online services: An experience report, 2013, 2013 28th IEEE/ACM International Conference on Automated Software Engineering (ASE).
[21] Zhaohui Wu, et al. CloudScout: A Non-Intrusive Approach to Service Dependency Discovery, 2017, IEEE Transactions on Parallel and Distributed Systems.
[22] Tong Zhang, et al. Deep Pyramid Convolutional Neural Networks for Text Categorization, 2017, ACL.
[23] Junjie Chen, et al. Continuous Incident Triage for Large-Scale Online Service Systems, 2019, 2019 34th IEEE/ACM International Conference on Automated Software Engineering (ASE).
[24] Qiang Fu, et al. Mining dependency in distributed systems through unstructured logs analysis, 2010, OPSR.
[25] Liang Gong, et al. Predicting bug-fixing time: An empirical study of commercial software projects, 2013, 2013 35th International Conference on Software Engineering (ICSE).
[26] Dongmei Zhang, et al. Predicting Node Failure in Cloud Service Systems, 2018, ESEC/SIGSOFT FSE.
[27] Xu Zhang, et al. Cross-dataset Time Series Anomaly Detection for Cloud Systems, 2019, USENIX Annual Technical Conference.
[28] Xu Chen, et al. Automating Network Application Dependency Discovery: Experiences, Limitations, and New Solutions, 2008, OSDI.
[29] Qiang Fu, et al. Mining Historical Issue Repositories to Heal Large-Scale Online Service Systems, 2014, 2014 44th Annual IEEE/IFIP International Conference on Dependable Systems and Networks.
[30] Shilin He, et al. Experience Report: System Log Analysis for Anomaly Detection, 2016, 2016 IEEE 27th International Symposium on Software Reliability Engineering (ISSRE).
[31] Dongmei Zhang, et al. Identifying impactful service system problems via log analysis, 2018, ESEC/SIGSOFT FSE.
[32] Domenico Cotroneo, et al. What Logs Should You Look at When an Application Fails? Insights from an Industrial Case Study, 2014, 2014 44th Annual IEEE/IFIP International Conference on Dependable Systems and Networks.
[33] Teodor-Florin Fortis, et al. Cloud Incident Management, Challenges, Research Directions, and Architectural Approach, 2014, 2014 IEEE/ACM 7th International Conference on Utility and Cloud Computing.
[34] Xin Peng, et al. A learning-based approach for automatic construction of domain glossary from source code and documentation, 2019, ESEC/SIGSOFT FSE.
[35] Aaron B. Brown, et al. An active approach to characterizing dynamic dependencies for problem determination in a distributed environment, 2001, 2001 IEEE/IFIP International Symposium on Integrated Network Management Proceedings. Integrated Network Management VII. Integrated Management Strategies for the New Millennium (Cat. No.01EX470).
[36] Peng Li, et al. Improving Service Availability of Cloud Systems by Predicting Disk Error, 2018, USENIX ATC.
[37] Alexander Gammerman, et al. Plug-in martingales for testing exchangeability on-line, 2012, ICML.
[38] Qiang Fu, et al. Execution Anomaly Detection in Distributed Systems through Unstructured Log Analysis, 2009, 2009 Ninth IEEE International Conference on Data Mining.