BotMark: Automated botnet detection with hybrid analysis of flow-based and graph-based traffic behaviors

Abstract The Botnets have become one of the most serious threats to cyber infrastructure. Most existing work on detecting botnets is based on flow-based traffic analysis by mining their communication patterns. There also exists related work based on anomaly detection in communication graphs. As bots have continuously evolved and become increasingly sophisticated, only using flow-based traffic analysis or graph-based analysis for the detection would result in false negatives or false positives, or can even be evaded. In this work, we propose BotMark, an automated model that detects botnets with hybrid analysis of flow-based and graph-based network traffic behaviors. We extract 15 statistical flow-based traffic features as well as 3 graph-based features in building the detection model. For flow-based detection, we consider the similarity and stability of C-flow as measurements in the detection. In particular, we employ k-means to measure the similarity of C-flows and assign similarity scores, and calculate stability score of C-flows through the distribution of packet length within a C-flow. The graph-based detection is based on the observation that the neighborhoods of anomalous nodes significantly differ from those of normal nodes in communication graphs. In particular, we use least-square technique and Local Outlier Factor (LOF) to calculate anomaly scores that measure the differences of their neighborhoods. Our models use the scores to mark bots. BotMark performs automated botnet detection with hybrid analysis of flow-based and graph-based traffic behaviors by ensemble of the detection results based on similarity scores, stability scores and anomaly scores. We collect a very large size of network traffic by simulating 5 newly propagated botnets, including Mirai, Black energy, Zeus, Athena and Ares in a real computing environment. Extensive experimental results demonstrate the effectiveness of BotMark. It achieves 99.94% in terms of detection accuracy, outperforming any individual detector with flow-based detection or graph-based detection.

[1]  Heejo Lee,et al.  Botnet Detection by Monitoring Group Activities in DNS Traffic , 2007, 7th IEEE International Conference on Computer and Information Technology (CIT 2007).

[2]  Ali A. Ghorbani,et al.  Towards effective feature selection in machine learning-based botnet detection approaches , 2014, 2014 IEEE Conference on Communications and Network Security.

[3]  Xiangliang Zhang,et al.  Autonomic intrusion detection: Adaptively detecting anomalies over unlabeled audit data streams in computer networks , 2014, Knowl. Based Syst..

[4]  Xiangliang Zhang,et al.  Profiling program behavior for anomaly intrusion detection based on the transition and frequency property of computer audit data , 2006, Comput. Secur..

[5]  Danai Koutra,et al.  Graph based anomaly detection and description: a survey , 2014, Data Mining and Knowledge Discovery.

[6]  Xiangliang Zhang,et al.  Characterizing Android apps' behavior for effective detection of malapps at large scale , 2017, Future Gener. Comput. Syst..

[7]  Nareli Cruz Cortés,et al.  Feature selection to detect botnets using machine learning algorithms , 2017, 2017 International Conference on Electronics, Communications and Computers (CONIELECOMP).

[8]  Radu State,et al.  BotGM: Unsupervised graph mining to detect botnets in traffic flows , 2017, 2017 1st Cyber Security in Networking Conference (CSNet).

[9]  Sencun Zhu,et al.  Privacy Risk Analysis and Mitigation of Analytics Libraries in the Android Ecosystem , 2020, IEEE Transactions on Mobile Computing.

[10]  Jiqiang Liu,et al.  Constructing important features from massive network traffic for lightweight intrusion detection , 2015, IET Inf. Secur..

[11]  Sharath Chandra Guntuku,et al.  Big Data Analytics framework for Peer-to-Peer Botnet detection using Random Forests , 2014, Inf. Sci..

[12]  Christopher Krügel,et al.  BotFinder: finding bots in network traffic without deep packet inspection , 2012, CoNEXT '12.

[13]  Prateek Mittal,et al.  BotGrep: Finding P2P Bots with Structured Graph Analysis , 2010, USENIX Security Symposium.

[14]  Xiangliang Zhang,et al.  Abstracting massive data for lightweight intrusion detection in computer networks , 2016, Inf. Sci..

[15]  Wei Wang,et al.  Effective android malware detection with a hybrid model based on deep autoencoder and convolutional neural network , 2018, Journal of Ambient Intelligence and Humanized Computing.

[16]  Guofei Gu,et al.  BotMiner: Clustering Analysis of Network Traffic for Protocol- and Structure-Independent Botnet Detection , 2008, USENIX Security Symposium.

[17]  Ali A. Ghorbani,et al.  Botnet detection based on traffic behavior analysis and flow intervals , 2013, Comput. Secur..

[18]  Xiangliang Zhang,et al.  Detecting Android malicious apps and categorizing benign apps with ensemble of classifiers , 2018, Future Gener. Comput. Syst..

[19]  Nizar Kheir,et al.  BotSuer: Suing Stealthy P2P Bots in Network Traffic through Netflow Analysis , 2013, CANS.

[20]  R. Anitha,et al.  Botnet detection via mining of traffic flow characteristics , 2016, Comput. Electr. Eng..

[21]  Jing Wang,et al.  Botnet detection using social graph analysis , 2014, 2014 52nd Annual Allerton Conference on Communication, Control, and Computing (Allerton).

[22]  Ali A. Ghorbani,et al.  Detecting P2P botnets through network behavior analysis and machine learning , 2011, 2011 Ninth Annual International Conference on Privacy, Security and Trust.

[23]  George Varghese,et al.  Graption: A graph-based P2P traffic classification framework for the internet backbone , 2011, Comput. Networks.

[24]  Ge Yu,et al.  Data-Adaptive Clustering Analysis for Online Botnet Detection , 2010, 2010 Third International Joint Conference on Computational Science and Optimization.

[25]  Xiaowei Xu,et al.  SCAN: a structural clustering algorithm for networks , 2007, KDD '07.

[26]  Wei Wang,et al.  Botnet Detection with Hybrid Analysis on Flow Based and Graph Based Features of Network Traffic , 2018, ICCCS.

[27]  Radu State,et al.  BotTrack: Tracking Botnets Using NetFlow and PageRank , 2011, Networking.

[28]  Xiangliang Zhang,et al.  Constructing attribute weights from computer audit data for effective intrusion detection , 2009, J. Syst. Softw..

[29]  Zhaoxin Zhang,et al.  A Novel Approach to Detect IRC-Based Botnets , 2009, 2009 International Conference on Networks Security, Wireless Communications and Trusted Computing.

[30]  W. Timothy Strayer,et al.  Using Machine Learning Techniques to Identify Botnet Traffic , 2006 .

[31]  Mohammad Marufuzzaman,et al.  Botnet detection using graph-based feature clustering , 2017, Journal of Big Data.

[32]  Xiangliang Zhang,et al.  Processing of massive audit data streams for real-time anomaly intrusion detection , 2008, Comput. Commun..

[33]  Christos Faloutsos,et al.  oddball: Spotting Anomalies in Weighted Graphs , 2010, PAKDD.

[34]  Xiangliang Zhang,et al.  CreditCoin: A Privacy-Preserving Blockchain-Based Incentive Announcement Network for Communications of Smart Vehicles , 2018, IEEE Transactions on Intelligent Transportation Systems.

[35]  Xiangliang Zhang,et al.  Exploring Permission-Induced Risk in Android Applications for Malicious Application Detection , 2014, IEEE Transactions on Information Forensics and Security.

[36]  Richi Nayak,et al.  Analyzing the Effectiveness of Graph Metrics for Anomaly Detection in Online Social Networks , 2012, WISE.

[37]  Ramesh Singh Rawat,et al.  Survey of Peer-to-Peer Botnets and Detection Frameworks , 2018, Int. J. Netw. Secur..

[38]  Xiangliang Zhang,et al.  Constructing Features for Detecting Android Malicious Applications: Issues, Taxonomy and Directions , 2019, IEEE Access.

[39]  Yang Wang,et al.  BotCapturer: Detecting Botnets based on Two-Layered Analysis with Graph Anomaly Detection and Network Traffic Clustering , 2018 .

[40]  Wen-Hwa Liao,et al.  Peer to Peer Botnet Detection Using Data Mining Scheme , 2010, 2010 International Conference on Internet Technology and Applications.

[41]  Jean-Loup Guillaume,et al.  Fast unfolding of communities in large networks , 2008, 0803.0476.