ChainSpot: Mining Service Logs for Cyber Security Threat Detection

Given service logs of who used what service, and when, how can we find intrusions and anomalies? In this paper, we propose ChainSpot, a cyber threat detection framework whose novelty is to build graphical patterns that summarize each user's sequential behavior in using application-layer services, and to discover deviations from that user's normal patterns. Besides modeling, the trade-off between feature explicitness and computational complexity is also addressed. The effectiveness and performance of the proposed method are evaluated on a dataset collected in a real environment. Experiments show that ChainSpot supports awareness of abnormal behaviors, which is the starting point of threat detection. The detection results are highly correlated with expert-labeled ground truth, so ChainSpot can significantly reduce forensic effort. Furthermore, case investigations demonstrate that the differences between benign and suspicious patterns can be interpreted to reconstruct attack scenarios. The analytic findings may then serve as indicators of compromise for threat detection and as in-depth clues for digital forensics.
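The general idea of summarizing a user's service sequence as a graph and scoring deviations from it can be illustrated with a minimal sketch. This is not the ChainSpot algorithm itself, which the abstract does not specify: the transition-probability graph, the absolute-difference score, the function names `transition_graph` and `deviation`, and the service names in the sample sequences are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def transition_graph(services):
    """Summarize an ordered list of service names as edge probabilities.

    Each edge (src, dst) maps to the fraction of transitions leaving src
    that go to dst. This representation is an illustrative assumption.
    """
    counts = Counter(zip(services, services[1:]))
    out_totals = defaultdict(int)
    for (src, _dst), n in counts.items():
        out_totals[src] += n
    return {edge: n / out_totals[edge[0]] for edge, n in counts.items()}


def deviation(baseline, observed):
    """Sum of absolute differences in edge probabilities.

    A hypothetical score: 0.0 means identical graphs, larger means
    the observed pattern departs further from the baseline.
    """
    edges = set(baseline) | set(observed)
    return sum(abs(baseline.get(e, 0.0) - observed.get(e, 0.0)) for e in edges)


# Hypothetical service sequences for one user.
normal = ["vpn", "mail", "proxy", "mail", "proxy"]
suspect = ["vpn", "ad", "ad", "ad", "proxy"]

base = transition_graph(normal)
score_same = deviation(base, transition_graph(normal))
score_diff = deviation(base, transition_graph(suspect))
```

Here `score_same` is zero because the observed graph equals the baseline, while `score_diff` is larger because the suspect sequence uses edges the baseline never contains. A practical detector would also need a threshold and a way to handle sparse histories, neither of which is covered by this sketch.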
