Automatic discovery of botnet communities on large-scale communication networks

Botnets are networks of compromised computers infected with malicious code that can be controlled remotely under a common command and control (C&C) channel. Recognized as one the most serious security threats on current Internet infrastructure, advanced botnets are hidden not only in existing well known network applications (e.g. IRC, HTTP, or Peer-to-Peer) but also in some unknown or novel (creative) applications, which makes the botnet detection a challenging problem. Most current attempts for detecting botnets are to examine traffic content for bot signatures on selected network links or by setting up honeypots. In this paper, we propose a new hierarchical framework to automatically discover botnets on a large-scale WiFi ISP network, in which we first classify the network traffic into different application communities by using payload signatures and a novel cross-association clustering algorithm, and then on each obtained application community, we analyze the temporal-frequent characteristics of flows that lead to the differentiation of malicious channels created by bots from normal traffic generated by human beings. We evaluate our approach with about 100 million flows collected over three consecutive days on a large-scale WiFi ISP network and results show the proposed approach successfully detects two types of botnet application flows (i.e. Blackenergy HTTP bot and Kaiten IRC bot) from about 100 million flows with a high detection rate and an acceptable low false alarm rate.

[1]  Hamza Dahmouni,et al.  A markovian signature-based approach to IP traffic classification , 2007, MineNet '07.

[2]  Mitsuaki Akiyama,et al.  A Proposal of Metrics for Botnet Detection Based on Its Cooperative Behavior , 2007, 2007 International Symposium on Applications and the Internet Workshops.

[3]  Konstantina Papagiannaki,et al.  Toward the Accurate Identification of Network Applications , 2005, PAM.

[4]  Suresh Singh,et al.  An Algorithm for Anomaly-based Botnet Detection , 2006, SRUTI.

[5]  Ryan Cunningham,et al.  Honeypot-Aware Advanced Botnet Construction and Maintenance , 2006, International Conference on Dependable Systems and Networks (DSN'06).

[6]  Sotiris Ioannidis,et al.  Antisocial Networks: Turning a Social Network into a Botnet , 2008, ISC.

[7]  Thorsten Holz,et al.  Rishi: Identify Bot Contaminated Hosts by IRC Nickname Evaluation , 2007, HotBots.

[8]  Felix C. Freiling,et al.  The Nepenthes Platform: An Efficient Approach to Collect Malware , 2006, RAID.

[9]  Guofei Gu,et al.  BotSniffer: Detecting Botnet Command and Control Channels in Network Traffic , 2008, NDSS.

[10]  Felix C. Freiling,et al.  Botnet Tracking: Exploring a Root-Cause Methodology to Prevent Distributed Denial-of-Service Attacks , 2005, ESORICS.

[11]  Dawn Song,et al.  Malware Detection (Advances in Information Security) , 2006 .

[12]  Sebastian Zander,et al.  A preliminary performance comparison of five machine learning algorithms for practical IP traffic flow classification , 2006, CCRV.

[13]  Brian Rexroad,et al.  Wide-Scale Botnet Detection and Characterization , 2007, HotBots.

[14]  Luca Salgarelli,et al.  Comparing traffic classifiers , 2007, CCRV.

[15]  Sebastian Zander,et al.  Automated traffic classification and application identification using machine learning , 2005, The IEEE Conference on Local Computer Networks 30th Anniversary (LCN'05)l.

[16]  Ping Wang,et al.  An Advanced Hybrid Peer-to-Peer Botnet , 2007, IEEE Transactions on Dependable and Secure Computing.

[17]  Guofei Gu,et al.  BotMiner: Clustering Analysis of Network Traffic for Protocol- and Structure-Independent Botnet Detection , 2008, USENIX Security Symposium.

[18]  Yan Chen,et al.  Honeynet-based Botnet Scan Traffic Analysis , 2008, Botnet Detection.

[19]  Renata Teixeira,et al.  Early application identification , 2006, CoNEXT '06.

[20]  W. Timothy Strayer,et al.  Botnet Detection Based on Network Behavior , 2008, Botnet Detection.

[21]  W. Timothy Strayer,et al.  Detecting Botnets with Tight Command and Control , 2006, Proceedings. 2006 31st IEEE Conference on Local Computer Networks.

[22]  Vinod Yegneswaran,et al.  BotHunter: Detecting Malware Infection Through IDS-Driven Dialog Correlation , 2007, USENIX Security Symposium.

[23]  Steven Michael Bellovin Proceedings of the 3rd USENIX workshop on Steps to reducing unwanted traffic on the internet , 2007 .

[24]  Salvatore J. Stolfo,et al.  Anomalous Payload-Based Worm Detection and Signature Generation , 2005, RAID.

[25]  Christos Faloutsos,et al.  Fully automatic cross-associations , 2004, KDD.

[26]  Wenke Lee,et al.  Botnet Detection: Countering the Largest Security Threat , 2010, Botnet Detection.

[27]  Michalis Faloutsos,et al.  BLINC: multilevel traffic classification in the dark , 2005, SIGCOMM '05.

[28]  Vinod Yegneswaran,et al.  Using Honeynets for Internet Situational Awareness , 2005 .

[29]  Andrew W. Moore,et al.  Internet traffic classification using bayesian analysis techniques , 2005, SIGMETRICS '05.

[30]  Matthew Roughan,et al.  Class-of-service mapping for QoS: a statistical signature-based approach to IP traffic classification , 2004, IMC '04.

[31]  Andreas Terzis,et al.  A multifaceted approach to understanding the botnet phenomenon , 2006, IMC '06.

[32]  Vinod Yegneswaran,et al.  An Inside Look at Botnets , 2007, Malware Detection.

[33]  James Won-Ki Hong,et al.  Towards automated application signature generation for traffic identification , 2008, NOMS 2008 - 2008 IEEE Network Operations and Management Symposium.

[34]  Felix C. Freiling,et al.  Measurements and Mitigation of Peer-to-Peer-based Botnets: A Case Study on Storm Worm , 2008, LEET.

[35]  Renata Teixeira,et al.  Traffic classification on the fly , 2006, CCRV.

[36]  Anthony McGregor,et al.  Flow Clustering Using Machine Learning Techniques , 2004, PAM.

[37]  Maurizio Dusi,et al.  Traffic classification through simple statistical fingerprinting , 2007, CCRV.

[38]  Eleazar Eskin,et al.  Anomaly Detection over Noisy Data using Learned Probability Distributions , 2000, ICML.

[39]  Salvatore J. Stolfo,et al.  Anomalous Payload-Based Network Intrusion Detection , 2004, RAID.

[40]  W. Timothy Strayer,et al.  Using Machine Learning Techniques to Identify Botnet Traffic , 2006 .