Diagnosing bot infections using Bayesian inference

Prior research in botnet detection has used the bot lifecycle to build detection systems. These systems, however, use rule-based decision engines which lack automated adaptability and learning, accuracy tunability, the ability to cope with gaps in training data, and the ability to incorporate local security policies. To counter these limitations, we propose to replace the rigid decision engines in contemporary bot detectors with a more formal Bayesian inference engine. Bottleneck, our prototype implementation, builds confidence in bot infections based on the causal bot lifecycle encoded in a Bayesian network. We evaluate Bottleneck by applying it as a post-processing decision engine on lifecycle events generated by two existing bot detectors (BotHunter and BotFlex) on two independently-collected datasets. Our experimental results show that Bottleneck consistently achieves comparable or better accuracy than the existing rule-based detectors when the test data is similar to the training data. For differing training and test data, Bottleneck, due to its automated learning and inference models, easily surpasses the accuracies of rule-based systems. Moreover, Bottleneck’s stochastic nature allows its accuracy to be tuned with respect to organizational needs. Extending Bottleneck’s Bayesian network into an influence diagram allows for local security policies to be defined within our framework. Lastly, we show that Bottleneck can also be extended to incorporate evidence trustscore for false alarm reduction.

[1]  Thorsten Holz,et al.  Rishi: Identify Bot Contaminated Hosts by IRC Nickname Evaluation , 2007, HotBots.

[2]  Chun-Ying Huang,et al.  Fast-Flux Bot Detection in Real Time , 2010, RAID.

[3]  Finn V. Jensen,et al.  Bayesian Networks and Decision Graphs , 2001, Statistics for Engineering and Information Science.

[4]  Guofei Gu,et al.  BotMiner: Clustering Analysis of Network Traffic for Protocol- and Structure-Independent Botnet Detection , 2008, USENIX Security Symposium.

[5]  Christopher Krügel,et al.  Revolver: An Automated Approach to the Detection of Evasive Web-based Malware , 2013, USENIX Security Symposium.

[6]  Syed Ali Khayam,et al.  BotFlex: A community-driven tool for botnet detection , 2015, J. Netw. Comput. Appl..

[7]  Alex A. Freitas,et al.  A review of performance evaluation measures for hierarchical classifiers , 2007 .

[8]  A. Hasman,et al.  Probabilistic reasoning in intelligent systems: Networks of plausible inference , 1991 .

[9]  Jian Pei,et al.  2012- Data Mining. Concepts and Techniques, 3rd Edition.pdf , 2012 .

[10]  Chen Lu,et al.  Botnet traffic detection using hidden Markov models , 2011, CSIIRW '11.

[11]  L Poole David,et al.  Artificial Intelligence: Foundations of Computational Agents , 2010 .

[12]  Christopher Krügel,et al.  Nazca: Detecting Malware Distribution in Large-Scale Networks , 2014, NDSS.

[13]  Yao Zhao,et al.  BotGraph: Large Scale Spamming Botnet Detection , 2009, NSDI.

[14]  David J. Spiegelhalter,et al.  Bayesian analysis in expert systems , 1993 .

[15]  Judea Pearl,et al.  Probabilistic reasoning in intelligent systems - networks of plausible inference , 1991, Morgan Kaufmann series in representation and reasoning.

[16]  Gustavo Gonzalez Granadillo,et al.  Botnets: Lifecycle and Taxonomy , 2011, 2011 Conference on Network and Information Systems Security.

[17]  Kwan-Liu Ma,et al.  Interactive Visualization for Network and Port Scan Detection , 2005, RAID.

[18]  Ronaldo M. Salles,et al.  Botnets: A survey , 2013, Comput. Networks.

[19]  Ali A. Ghorbani,et al.  Detecting P2P botnets through network behavior analysis and machine learning , 2011, 2011 Ninth Annual International Conference on Privacy, Security and Trust.

[20]  Gregory F. Cooper,et al.  A Bayesian Method for the Induction of Probabilistic Networks from Data , 1992 .

[21]  Andreas Terzis,et al.  A multifaceted approach to understanding the botnet phenomenon , 2006, IMC '06.

[22]  Eric van den Berg,et al.  A Fast Static Analysis Approach to Detect Exploit Code Inside Network Flows , 2005, RAID.

[23]  Syed Ali Khayam,et al.  Poster : Bottleneck : A Generalized , Flexible , and Extensible Framework for Botnet Defense , 2012 .

[24]  Roberto Perdisci,et al.  ExecScent: Mining for New C&C Domains in Live Networks with Adaptive Control Protocol Templates , 2013, USENIX Security Symposium.

[25]  Syed Ali Khayam,et al.  A Taxonomy of Botnet Behavior, Detection, and Defense , 2014, IEEE Communications Surveys & Tutorials.

[26]  Giuseppe Antonio Di Luna,et al.  Collaborative Detection of Coordinated Port Scans , 2013, ICDCN.

[27]  Roberto Perdisci,et al.  Scalable fine-grained behavioral clustering of HTTP-based malware , 2013, Comput. Networks.

[28]  Vinod Yegneswaran,et al.  BotHunter: Detecting Malware Infection Through IDS-Driven Dialog Correlation , 2007, USENIX Security Symposium.

[29]  Martin Roesch,et al.  Snort - Lightweight Intrusion Detection for Networks , 1999 .

[30]  Dan Schnackenberg,et al.  Statistical approaches to DDoS attack detection and response , 2003, Proceedings DARPA Information Survivability Conference and Exposition.

[31]  R. Suganya,et al.  Data Mining Concepts and Techniques , 2010 .

[32]  Saman Taghavi Zargar,et al.  A Survey of Defense Mechanisms Against Distributed Denial of Service (DDoS) Flooding Attacks , 2013, IEEE Communications Surveys & Tutorials.

[33]  Vern Paxson,et al.  Outside the Closed World: On Using Machine Learning for Network Intrusion Detection , 2010, 2010 IEEE Symposium on Security and Privacy.

[34]  Russell Greiner,et al.  Learning Bayesian Belief Network Classifiers: Algorithms and System , 2001, Canadian Conference on AI.

[35]  Alan K. Mackworth,et al.  Artificial Intelligence - Foundations of Computational Agents , 2010 .

[36]  Roberto Perdisci,et al.  From Throw-Away Traffic to Bots: Detecting the Rise of DGA-Based Malware , 2012, USENIX Security Symposium.

[37]  Syed Ali Khayam,et al.  POSTER: BotFlex: a community-driven tool for botnetdetection , 2013, CCS.

[38]  Levente Buttyán,et al.  Duqu: Analysis, Detection, and Lessons Learned , 2012 .

[39]  Christopher Krügel,et al.  Bayesian event classification for intrusion detection , 2003, 19th Annual Computer Security Applications Conference, 2003. Proceedings..