Paper Information - Anomaly Detection and Levels of Automation for AI-Supported System Administration

Artificial Intelligence for IT Operations (AIOps) describes the process of maintaining and operating large IT infrastructures using AI-supported methods and tools at different levels. This includes automated anomaly detection and root cause analysis, remediation and optimization, as well as fully automated initiation of self-stabilizing activities. While automation is mandatory due to system complexity and the criticality of QoS-bounded responses, the measures compiled and deployed by the AI-controlled administration are not always easy to understand or reproduce. Therefore, explainable actions taken by automated systems are becoming a regulatory requirement for future IT infrastructures. In this paper, we present ZerOps, a developed and deployed system, as an example of the design of the corresponding architecture, tools, and methods. The system uses deep learning models and data analytics on monitoring data to detect and remediate anomalies.

Florian Schmidt | Odej Kao | Anton Gulenko
