Distributed Systems Anomaly Detection Based on Log

Driven by the rapid development of information technology, distributed systems are now widely used. A distributed system consists of a large number of nodes and components, so maintaining it usually requires substantial manual work. To reduce the complexity and workload of operating and maintaining such systems, log-based anomaly detection methods are increasingly applied to large-scale distributed systems. However, these methods do not consider the temporal and spatial characteristics of logs. To bridge this gap, we propose an anomaly detection method based on the logs generated by distributed systems. We design a template parsing algorithm that parses logs using a Transformer encoder and clustering at two different granularities. We use an anomaly detection algorithm that captures anomalies in both time and space by combining a CNN, an LSTM, and an attention mechanism. In addition, we optimize the detection window by combining the session window with the sliding window, and we reduce computational complexity by changing the connection between the LSTM and the CNN.
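As an illustration of combining a session window with a sliding window, the following is a minimal sketch of one plausible reading: logs are first grouped by a session identifier, and a fixed-size window then slides over each session's sequence of template IDs. The abstract does not specify how the two windows are combined, so the record shape, the function name `session_sliding_windows`, and the parameters `window_size` and `step` are all assumptions made for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# Hypothetical record shape: (session_id, template_id), already in time order.
LogEvent = Tuple[str, int]


def session_sliding_windows(
    events: Iterable[LogEvent],
    window_size: int = 3,
    step: int = 1,
) -> List[Tuple[str, List[int]]]:
    """Group events by session, then slide a fixed-size window in each session.

    Sessions no longer than `window_size` are emitted as a single window.
    With step > 1, trailing events that do not fill a window are dropped.
    """
    # Session window: collect each session's template IDs in arrival order.
    sessions: Dict[str, List[int]] = defaultdict(list)
    for session_id, template_id in events:
        sessions[session_id].append(template_id)

    # Sliding window: cut each session into overlapping subsequences.
    windows: List[Tuple[str, List[int]]] = []
    for session_id, seq in sessions.items():
        if len(seq) <= window_size:
            windows.append((session_id, list(seq)))
            continue
        for start in range(0, len(seq) - window_size + 1, step):
            windows.append((session_id, seq[start:start + window_size]))
    return windows


# Interleaved events from two hypothetical sessions.
events = [
    ("blk_1", 5), ("blk_2", 5), ("blk_1", 7),
    ("blk_1", 9), ("blk_2", 11), ("blk_1", 7),
]
windows = session_sliding_windows(events, window_size=3, step=1)
```

Under this reading, each window contains events from a single session only, so interleaved logs from different nodes never share a window, while long sessions still yield fixed-length inputs for a sequence model.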