Machine Learning Approach for IP-Flow Record Anomaly Detection

Faced with continuously emerging threats, detecting anomalies in operational networks has become essential. Network operators have to analyze huge volumes of data. To cope with this, network management commonly relies on IP flow (also known as NetFlow) records. Even so, in modern networks NetFlow records still represent a high volume of data. In this paper, we present an approach for evaluating NetFlow records that combines temporal aggregation with machine learning techniques. The approach leverages support vector machines to analyze large volumes of NetFlow records and uses a special kernel function that takes into account both the contextual and the quantitative information of NetFlow records. We assess the viability of our method through practical experiments on data provided by a major internet service provider in Luxembourg.
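The abstract does not define the kernel function, so the following is only an illustrative sketch of how contextual and quantitative information might be combined in one kernel. It is not the authors' kernel. It assumes a flow record reduced to a hypothetical `(src_ip, dst_ip, byte_count)` tuple, uses the shared IPv4 prefix length as the contextual similarity, and uses a Gaussian on log-scaled byte counts as the quantitative similarity. All function names and the `sigma` parameter are assumptions made for this example.

```python
import ipaddress
import math


def shared_prefix_len(a: str, b: str) -> int:
    """Number of leading bits two IPv4 addresses have in common (0..32)."""
    diff = int(ipaddress.IPv4Address(a)) ^ int(ipaddress.IPv4Address(b))
    return 32 - diff.bit_length()


def context_kernel(a: str, b: str) -> float:
    """Contextual similarity in [0, 1]: fraction of shared leading bits."""
    return shared_prefix_len(a, b) / 32.0


def quantity_kernel(x: float, y: float, sigma: float = 1.0) -> float:
    """Gaussian similarity on log-scaled volumes (assumed scaling)."""
    d = math.log1p(x) - math.log1p(y)
    return math.exp(-(d * d) / (2.0 * sigma * sigma))


def flow_kernel(f, g, sigma: float = 1.0) -> float:
    """Hypothetical combined kernel on (src_ip, dst_ip, byte_count) tuples.

    Averages the source and destination contextual similarities, then
    multiplies by the quantitative similarity of the byte counts.
    """
    src_f, dst_f, bytes_f = f
    src_g, dst_g, bytes_g = g
    ctx = 0.5 * (context_kernel(src_f, src_g) + context_kernel(dst_f, dst_g))
    return ctx * quantity_kernel(bytes_f, bytes_g, sigma)


if __name__ == "__main__":
    a = ("10.0.0.1", "192.168.1.1", 1000)
    b = ("10.0.0.2", "192.168.1.1", 1000)
    print(flow_kernel(a, a))  # identical flows
    print(flow_kernel(a, b))  # same volume, nearby source address
```

A function of this shape could be used to fill a precomputed Gram matrix for a support vector machine. Sums and products of valid kernels are again valid kernels, which is why the two parts are combined this way in the sketch.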
