HERCULE: attack story reconstruction via community discovery on correlated log graph

Advanced cyber attacks consist of multiple stages aimed at being stealthy and elusive. Such attack patterns leave their footprints spatio-temporally dispersed across many different logs in victim machines. However, existing log-mining intrusion analysis systems typically target only a single type of log to discover evidence of an attack and therefore fail to exploit fundamental inter-log connections. The output of such single-log analysis can hardly reveal the complete attack story for complex, multi-stage attacks. Additionally, some existing approaches require heavyweight system instrumentation, which makes them impractical to deploy in real production environments. To address these problems, we present HERCULE, an automated multi-stage log-based intrusion analysis system. Inspired by graph analytics research in social network analysis, we model multi-stage intrusion analysis as a community discovery problem. HERCULE builds multi-dimensional weighted graphs by correlating log entries across multiple lightweight logs that are readily available on commodity systems. From these, HERCULE discovers any "attack communities" embedded within the graphs. Our evaluation with 15 well known APT attack families demonstrates that HERCULE can reconstruct attack behaviors from a spectrum of cyber attacks that involve multiple stages with high accuracy and low false positive rates.

[1]  Christian Rossow,et al.  ProVeX: Detecting Botnets with Encrypted Command and Control Channels , 2013, DIMVA.

[2]  Naren Ramakrishnan,et al.  Detection of stealthy malware activities with traffic causality and scalable triggering relation discovery , 2014, AsiaCCS.

[3]  Zhenkai Liang,et al.  BitBlaze: A New Approach to Computer Security via Binary Analysis , 2008, ICISS.

[4]  Vinod Yegneswaran,et al.  BotHunter: Detecting Malware Infection Through IDS-Driven Dialog Correlation , 2007, USENIX Security Symposium.

[5]  Zhou Li,et al.  Detection of Early-Stage Enterprise Infection by Mining Large-Scale Log Data , 2014, 2015 45th Annual IEEE/IFIP International Conference on Dependable Systems and Networks.

[6]  Jean-Loup Guillaume,et al.  Fast unfolding of communities in large networks , 2008, 0803.0476.

[7]  Christopher Krügel,et al.  Effective and Efficient Malware Detection at the End Host , 2009, USENIX Security Symposium.

[8]  Xi Wang,et al.  Intrusion Recovery Using Selective Re-execution , 2010, OSDI.

[9]  Roberto Perdisci,et al.  From Throw-Away Traffic to Bots: Detecting the Rise of DGA-Based Malware , 2012, USENIX Security Symposium.

[10]  Mansour Ahmadi,et al.  DroidScribe: Classifying Android Malware Based on Runtime Behavior , 2016, 2016 IEEE Security and Privacy Workshops (SPW).

[11]  Stefano Zanero,et al.  Detecting Intrusions through System Call Sequence and Argument Analysis , 2010, IEEE Transactions on Dependable and Secure Computing.

[12]  Robin Sommer,et al.  Count Me In: Viable Distributed Summary Statistics for Securing High-Speed Networks , 2014, RAID.

[13]  Erland Jonsson,et al.  Using active learning in intrusion detection , 2004, Proceedings. 17th IEEE Computer Security Foundations Workshop, 2004..

[14]  Brian Neil Levine,et al.  Detecting the Sybil Attack in Mobile Ad hoc Networks , 2006, 2006 Securecomm and Workshops.

[15]  Matthew Smith,et al.  VCCFinder: Finding Potential Vulnerabilities in Open-Source Projects to Assist Code Audits , 2015, CCS.

[16]  Zhuoqing Morley Mao,et al.  Automated Classification and Analysis of Internet Malware , 2007, RAID.

[17]  Stephen McCamant,et al.  DTA++: Dynamic Taint Analysis with Targeted Control-Flow Propagation , 2011, NDSS.

[18]  Christopher Krügel,et al.  Revolver: An Automated Approach to the Detection of Evasive Web-based Malware , 2013, USENIX Security Symposium.

[19]  Zhendong Su,et al.  Temporal search: detecting hidden malware timebombs with virtual machines , 2006, ASPLOS XII.

[20]  Wenke Lee,et al.  McBoost: Boosting Scalability in Malware Collection and Analysis Using Statistical Classification of Executables , 2008, 2008 Annual Computer Security Applications Conference (ACSAC).

[21]  Abhinav Srivastava,et al.  Robust signatures for kernel data structures , 2009, CCS.

[22]  Xiangyu Zhang,et al.  High Accuracy Attack Provenance via Binary-based Execution Partition , 2013, NDSS.

[23]  Fei Peng,et al.  X-Force: Force-Executing Binary Programs for Security Applications , 2014, USENIX Security Symposium.

[24]  Christopher Krügel,et al.  Nexat: a history-based approach to predict attacker actions , 2011, ACSAC '11.

[25]  Xiangyu Zhang,et al.  IntroPerf: transparent context-sensitive multi-layer performance inference using system stack traces , 2014, SIGMETRICS '14.

[26]  Heng Yin,et al.  Panorama: capturing system-wide information flow for malware detection and analysis , 2007, CCS '07.

[27]  Wu-chi Feng,et al.  Automatic high-performance reconstruction and recovery , 2007, Comput. Networks.

[28]  Keith Marzullo,et al.  Analysis of Computer Intrusions Using Sequences of Function Calls , 2007, IEEE Transactions on Dependable and Secure Computing.

[29]  Xuxian Jiang,et al.  Provenance-Aware Tracing ofWorm Break-in and Contaminations: A Process Coloring Approach , 2006, 26th IEEE International Conference on Distributed Computing Systems (ICDCS'06).

[30]  Corinna Cortes,et al.  Support-Vector Networks , 1995, Machine Learning.

[31]  James Newsome,et al.  Dynamic Taint Analysis for Automatic Detection, Analysis, and SignatureGeneration of Exploits on Commodity Software , 2005, NDSS.

[32]  Kevin Borders,et al.  Web tap: detecting covert web traffic , 2004, CCS '04.

[33]  Eric R. Ziegel,et al.  Generalized Linear Models , 2002, Technometrics.

[34]  Jong Kim,et al.  WarningBird: Detecting Suspicious URLs in Twitter Stream , 2012, NDSS.

[35]  Niels Provos,et al.  Ghost Turns Zombie: Exploring the Life Cycle of Web-based Malware , 2008, LEET.

[36]  Christopher Krügel,et al.  A quantitative study of accuracy in system call-based malware detection , 2012, ISSTA 2012.

[37]  U. Flegel,et al.  Detection of Intrusions and Malware & Vulnerability Assessment , 2004 .

[38]  Naren Ramakrishnan,et al.  Causality reasoning about network events for detecting stealthy malware activities , 2016, Comput. Secur..

[39]  Samuel T. King,et al.  Enriching Intrusion Alerts Through Multi-Host Causality , 2005, NDSS.

[40]  Insup Lee,et al.  Spam mitigation using spatio-temporal reputations from blacklist history , 2010, ACSAC '10.

[41]  Naren Ramakrishnan,et al.  Unearthing Stealthy Program Attacks Buried in Extremely Long Execution Paths , 2015, CCS.

[42]  Leyla Bilge,et al.  Disclosure: detecting botnet command and control servers through large-scale NetFlow analysis , 2012, ACSAC '12.

[43]  Xiangyu Zhang,et al.  ProTracer: Towards Practical Provenance Tracing by Alternating Between Logging and Tainting , 2016, NDSS.

[44]  Cristina L. Abad,et al.  Log correlation for intrusion detection: a proof of concept , 2003, 19th Annual Computer Security Applications Conference, 2003. Proceedings..

[45]  Wenke Lee,et al.  Detecting Malware Domains at the Upper DNS Hierarchy , 2011, USENIX Security Symposium.

[46]  Luo Si,et al.  LEAPS: Detecting Camouflaged Attacks with Statistical Learning Guided by Program Analysis , 2015, 2015 45th Annual IEEE/IFIP International Conference on Dependable Systems and Networks.