Behavioral anomaly detection approach based on log monitoring

Log monitoring is an effective way to detect anomalies in large-scale software systems. Many existing anomaly detection studies analyze log semantics or frequency features within a single time interval. In this paper, we present a new detection method that predicts the system state by detecting anomalous behaviors extracted from log messages. Our method consists of two major steps. First, we preprocess log messages through log normalization and an efficient hierarchical clustering operation. Second, we generate behavior pattern sets from the clustered messages and assign an anomaly score to each new log sequence according to the relation between that sequence and the corresponding behavior patterns. Experiments on real-world log data show that our method can predict system anomalies with high accuracy.
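
The two steps can be illustrated with a minimal sketch. It is not the paper's implementation, and every name and sample log line in it is hypothetical. It assumes that normalization means masking variable fields such as IP addresses and numbers, it replaces the hierarchical clustering with a simple exact match on the normalized template, and it assumes that behavior patterns are adjacent pairs of cluster IDs. The anomaly score is then the fraction of pairs in a new sequence that never occurred in the training data.

```python
import re


def normalize(line):
    """Mask variable fields so messages of the same event type look alike."""
    line = re.sub(r"\b\d{1,3}(?:\.\d{1,3}){3}\b", "<IP>", line)
    line = re.sub(r"\b0x[0-9a-fA-F]+\b", "<HEX>", line)
    line = re.sub(r"\d+", "<NUM>", line)
    return line.strip()


def to_ids(lines, vocab, grow):
    """Map each line to a cluster ID (assumption: one cluster per template).

    Templates not in vocab get a new ID when grow is True, otherwise -1.
    """
    ids = []
    for line in lines:
        key = normalize(line)
        if key not in vocab and grow:
            vocab[key] = len(vocab)
        ids.append(vocab.get(key, -1))
    return ids


def ngrams(seq, n):
    """Return all contiguous length-n windows of seq as tuples."""
    return [tuple(seq[i:i + n]) for i in range(len(seq) - n + 1)]


def build_patterns(sequences, n=2):
    """Collect every n-gram of cluster IDs seen in normal sequences."""
    patterns = set()
    for seq in sequences:
        patterns.update(ngrams(seq, n))
    return patterns


def anomaly_score(seq, patterns, n=2):
    """Fraction of n-grams in seq that are absent from the pattern set."""
    grams = ngrams(seq, n)
    if not grams:
        return 0.0
    unseen = sum(1 for g in grams if g not in patterns)
    return unseen / len(grams)


# Hypothetical training data: one sequence of normal log lines.
vocab = {}
train_ids = to_ids(
    ["conn from 10.0.0.1 port 22", "auth ok user 17", "session 5 closed"],
    vocab,
    grow=True,
)
patterns = build_patterns([train_ids])

# Same behavior with different field values.
normal_ids = to_ids(
    ["conn from 192.168.1.9 port 8080", "auth ok user 3", "session 99 closed"],
    vocab,
    grow=False,
)
# Missing auth step followed by a message type never seen in training.
odd_ids = to_ids(
    ["conn from 10.0.0.2 port 22", "session 7 closed", "kernel panic at 0xdeadbeef"],
    vocab,
    grow=False,
)

normal_score = anomaly_score(normal_ids, patterns)
odd_score = anomaly_score(odd_ids, patterns)
print(normal_score, odd_score)
```

In this toy setup the sequence that repeats the trained order scores lowest and the sequence with a skipped step and an unknown message scores highest. A real system would need a clustering step that tolerates small template differences and a threshold chosen from labeled data.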
