Robust traffic anomaly detection with principal component pursuit

Principal component analysis (PCA) is a statistical technique widely used for data analysis and dimensionality reduction. It was first introduced as a network traffic anomaly detection technique in [1]. Since then, it has received considerable research attention, resulting in extensive analysis and several extensions. In [2], PCA-based traffic anomaly detection was shown to be sensitive to its tuning parameters, such as the dimension of the low-rank subspace and the detection threshold. However, [2] gave no explanation of the underlying causes of this sensitivity. In [3], the sensitivity was investigated further and attributed to the inability of PCA to capture temporal correlations. Based on this finding, [3] proposed extending PCA to the Karhunen-Loeve expansion (KLE). While KLE shows a slight improvement, it still exhibits a similar sensitivity issue, since it introduces a new tuning parameter, the temporal correlation range. More recently, [4] illustrated the PCA-poisoning problem. To highlight this problem, an evasion strategy called Boiled-Frog was proposed, which adds a high fraction of outliers to the traffic. To defend against it, the authors employed a more robust version of PCA called PCA-GRID. While PCA-GRID is more robust to outliers, it is highly sensitive to the threshold estimate and to the choice of the k-dimensional subspace that maximizes the dispersion of the data. The purpose of this work is to consider another technique, Principal Component Pursuit, to address the PCA-poisoning problem and provide robust traffic anomaly detection.