Botnet Detection System Analysis on the Effect of Botnet Evolution and Feature Representation

Botnets are known as one of the main destructive threats that have been active since 2003 in various forms. The ability to upgrade the structure and algorithms on the fly is part of what causes botnets to survive for more than a decade. Hence, one of the main concerns in designing a botnet detection system is how long such a system can be effective and useful considering the evolution of a given botnet. Furthermore, the data representation and the feature extraction components have always been an important issue in order to design a robust detection system. In this work, we employ machine learning algorithms (genetic programming and decision trees) to explore two questions: (i) How can the representation of non-numeric features effect the detection system's performance? and (ii) How long can a machine learning based detection system can perform effectively? To this end, we gathered seven Zeus botnet data sets over a period of four years and analyzed three different data representation techniques to be able to explore aforementioned questions.

[1]  Malcolm I. Heywood,et al.  Symbiosis, complexification and simplicity under GP , 2010, GECCO '10.

[2]  Amr M. Youssef,et al.  On the analysis of the Zeus botnet crimeware toolkit , 2010, 2010 Eighth International Conference on Privacy, Security and Trust.

[3]  Malcolm I. Heywood,et al.  Coevolutionary bid-based genetic programming for problem decomposition in classification , 2008, Genetic Programming and Evolvable Machines.

[4]  Ali A. Ghorbani,et al.  Towards effective feature selection in machine learning-based botnet detection approaches , 2014, 2014 IEEE Conference on Communications and Network Security.

[5]  A. Nur Zincir-Heywood,et al.  Analyzing string format-based classifiers for botnet detection: GP and SVM , 2013, 2013 IEEE Congress on Evolutionary Computation.

[6]  A. Nur Zincir-Heywood,et al.  On botnet behaviour analysis using GP and C4.5 , 2014, GECCO.

[7]  George Kesidis,et al.  Salting Public Traces with Attack Traffic to Test Flow Classifiers , 2011, CSET.

[8]  Guofei Gu,et al.  BotMiner: Clustering Analysis of Network Traffic for Protocol- and Structure-Independent Botnet Detection , 2008, USENIX Security Symposium.

[9]  A. Nur Zincir-Heywood,et al.  Data Confirmation for Botnet Traffic Analysis , 2014, FPS.

[10]  Tomáš Plesník,et al.  Detecting Botnets with NetFlow , 2011 .

[11]  A. Nur Zincir-Heywood,et al.  Benchmarking the Effect of Flow Exporters and Protocol Filters on Botnet Traffic Classification , 2016, IEEE Systems Journal.

[12]  M. Kubát An Introduction to Machine Learning , 2017, Springer International Publishing.

[13]  Robert Tibshirani,et al.  An Introduction to the Bootstrap , 1994 .

[14]  Chun-Ying Huang,et al.  A fuzzy pattern-based filtering algorithm for botnet detection , 2011, Comput. Networks.

[15]  Wolfgang Banzhaf,et al.  A comparison of linear genetic programming and neural networks in medical data mining , 2001, IEEE Trans. Evol. Comput..

[16]  W. Timothy Strayer,et al.  Botnet Detection Based on Network Behavior , 2008, Botnet Detection.