Anomaly detection of Internet traffic using robust feature selection based on kernel density estimation

Anomaly detection of Internet traffic is a network service of primary importance, given the constant threats to Internet security. From a statistical perspective, traffic anomalies can be considered outliers and must be handled through effective outlier detection methods, for which feature selection is an important pre-processing step. Feature selection removes redundant and irrelevant features from the detection process, improving its performance. In this work, we consider outlier detection based on principal component analysis and feature selection based on mutual information. Moreover, we address the use of kernel density estimation (KDE) to estimate the mutual information; KDE is designed for continuous features and avoids the discretization step required by histograms. Our results, obtained using a high-quality ground truth, clearly show the usefulness of feature selection and the superiority of KDE for estimating the mutual information in the context of Internet traffic anomaly detection.
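
To make the KDE-based estimate of mutual information concrete, the following is a minimal sketch and not the authors' implementation. It assumes two continuous features, a product Gaussian kernel, Silverman's rule-of-thumb bandwidth for each feature, and a resubstitution estimate that averages log(p(x,y) / (p(x) p(y))) over the samples. The function names and the synthetic data are hypothetical and chosen only for illustration.

```python
import numpy as np


def silverman_bandwidth(samples):
    # Rule-of-thumb bandwidth for a 1-D Gaussian kernel (an assumption of
    # this sketch; the abstract does not say how the bandwidth is chosen).
    n = len(samples)
    return 1.06 * samples.std(ddof=1) * n ** (-1.0 / 5.0)


def gaussian_kernel_matrix(samples, h):
    # Entry (i, j) is the Gaussian kernel centred on sample j, evaluated at sample i.
    diff = (samples[:, None] - samples[None, :]) / h
    return np.exp(-0.5 * diff ** 2) / (h * np.sqrt(2.0 * np.pi))


def kde_mutual_information(x, y):
    """Resubstitution KDE estimate of I(X;Y) in nats for two continuous features."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    kx = gaussian_kernel_matrix(x, silverman_bandwidth(x))
    ky = gaussian_kernel_matrix(y, silverman_bandwidth(y))
    p_x = kx.mean(axis=1)          # marginal density of X at each sample
    p_y = ky.mean(axis=1)          # marginal density of Y at each sample
    p_xy = (kx * ky).mean(axis=1)  # joint density via the product kernel
    return float(np.mean(np.log(p_xy / (p_x * p_y))))


# Hypothetical synthetic features: y depends strongly on x, z does not.
rng = np.random.default_rng(0)
x = rng.normal(size=500)
y = x + 0.1 * rng.normal(size=500)
z = rng.normal(size=500)

mi_dependent = kde_mutual_information(x, y)
mi_independent = kde_mutual_information(x, z)
print(f"MI(x, y) = {mi_dependent:.3f} nats")
print(f"MI(x, z) = {mi_independent:.3f} nats")
```

No binning is involved: the densities are evaluated directly at the observed samples, which is what lets KDE skip the discretization step that a histogram estimator needs. In a feature selection setting, estimates of this kind could be used to score the dependence between pairs of features, although the exact selection criterion is not specified in the abstract.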
