Employing machine learning algorithms to detect unknown scanning and email worms

Abstract: We present a worm detection system that combines the reliability of IP flow data with the effectiveness of machine learning classifiers. Typically, a host infected by a scanning or email worm initiates a significant amount of traffic that does not rely on DNS to translate names into numeric IP addresses. Based on this observation, we capture and classify NetFlow records to extract a feature pattern for each PC on the network within a given time window. A feature pattern consists of the number of DNS requests, the number of DNS responses, the number of DNS normals, and the number of DNS anomalies. Two learning algorithms, K-Nearest Neighbors (KNN) and Naive Bayes (NB), are used for classification. Two statistical procedures, cross-validation and the paired t-test, are used to compare the performance of the KNN and NB algorithms. We use classification accuracy, false alarm rate, and training time as performance metrics to determine which algorithm is superior. The data set used to train and test the algorithms was created from 18 real-life worm variants along with a large number of benign flows.
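To make the four-element feature pattern concrete, the following is a minimal Python sketch of how per-host counts might be derived from flow records in one time window. It is not the authors' implementation. The `Flow` record, the `feature_patterns` function, and the `monitored` set are hypothetical names, and the rule used to separate "DNS normals" from "DNS anomalies" is a simplifying assumption: each DNS response received by a host is taken to cover one later non-DNS flow from that host, and any non-DNS flow without such a preceding response is counted as an anomaly.

```python
from collections import defaultdict
from dataclasses import dataclass

DNS_PORT = 53


@dataclass(frozen=True)
class Flow:
    """Hypothetical, simplified flow record (not the NetFlow export format)."""
    src: str
    dst: str
    src_port: int
    dst_port: int


def feature_patterns(flows, monitored):
    """Return {host: (requests, responses, normals, anomalies)} for one window.

    `flows` is assumed to be sorted by start time. Only hosts listed in
    `monitored` receive a feature pattern.
    """
    counts = defaultdict(lambda: [0, 0, 0, 0])
    unused_responses = defaultdict(int)  # DNS answers not yet "spent" by a flow

    for f in flows:
        if f.dst_port == DNS_PORT:
            # Flow towards a DNS server: counted as a DNS request.
            if f.src in monitored:
                counts[f.src][0] += 1
        elif f.src_port == DNS_PORT:
            # Flow coming back from a DNS server: counted as a DNS response.
            if f.dst in monitored:
                counts[f.dst][1] += 1
                unused_responses[f.dst] += 1
        elif f.src in monitored:
            # Non-DNS flow. Assumed rule: it is "normal" if an earlier DNS
            # response is still unused for this host, otherwise an "anomaly".
            if unused_responses[f.src] > 0:
                unused_responses[f.src] -= 1
                counts[f.src][2] += 1
            else:
                counts[f.src][3] += 1

    return {host: tuple(c) for host, c in counts.items()}


# Illustrative data only: one host resolves a name before connecting,
# the other opens connections with no DNS activity at all.
sample_flows = [
    Flow(src="10.0.0.5", dst="10.0.0.1", src_port=40000, dst_port=53),
    Flow(src="10.0.0.1", dst="10.0.0.5", src_port=53, dst_port=40000),
    Flow(src="10.0.0.5", dst="93.184.216.34", src_port=40001, dst_port=80),
    Flow(src="10.0.0.9", dst="198.51.100.7", src_port=40002, dst_port=445),
    Flow(src="10.0.0.9", dst="198.51.100.8", src_port=40003, dst_port=445),
]
patterns = feature_patterns(sample_flows, monitored={"10.0.0.5", "10.0.0.9"})
print(patterns)
```

Each resulting tuple could then serve as one input vector to a classifier such as KNN or NB. In this toy window the second host produces only anomalies, which is the kind of DNS-less traffic the abstract associates with worm activity.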
