Anomaly localization for network data streams with graph joint sparse PCA

Detecting anomalies in data streams that are collected and transformed from various types of networks has recently attracted significant research interest. Principal Component Analysis (PCA) has been extensively applied to detecting anomalies in network data streams. However, none of the existing PCA-based approaches addresses the problem of identifying the sources that contribute most to the observed anomaly, that is, anomaly localization. In this paper, we propose novel sparse PCA methods to perform anomaly detection and localization for network data streams. Our key observation is that we can localize anomalies by identifying a sparse low-dimensional space that captures the abnormal events in data streams. To better capture the sources of anomalies, we incorporate the structure information of the network stream data in our anomaly localization framework. We have performed comprehensive experimental studies of the proposed methods, and have compared our methods with the state-of-the-art using three real-world data sets from different application domains. Our experimental studies demonstrate the utility of the proposed methods.
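
The key observation above can be illustrated with a minimal sketch. It is not the joint sparse PCA optimization proposed in the paper; it is a simplified stand-in that assumes the abnormal subspace is spanned by the principal components left after removing the top `n_normal` components, and it replaces the sparse optimization with a single row-wise (group) soft-thresholding step on the loadings. The function name `localize_by_row_sparsity`, the parameters `n_normal` and `rel_threshold`, and the synthetic data are all hypothetical choices made for illustration.

```python
import numpy as np

def localize_by_row_sparsity(X, n_normal=1, rel_threshold=0.5):
    """Rank streams by their row-sparse loading in the abnormal subspace.

    X has shape (T, p): T time steps, one column per stream.
    This is an illustrative proxy, not the paper's algorithm.
    """
    Xc = X - X.mean(axis=0)                      # center each stream
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    # Loadings of the abnormal components, scaled by their singular values,
    # so each row norm equals that stream's residual energy (square root).
    loadings = Vt[n_normal:].T * s[n_normal:]    # shape (p, r - n_normal)
    row_norms = np.linalg.norm(loadings, axis=1)
    # Group soft-thresholding: shrink every row, zeroing out weak ones.
    lam = rel_threshold * row_norms.max()
    shrink = np.maximum(0.0, 1.0 - lam / np.maximum(row_norms, 1e-12))
    scores = shrink * row_norms
    flagged = [int(j) for j in np.flatnonzero(scores > 0)]
    return scores, flagged

# Synthetic example (assumed data): six streams share one strong common
# signal; stream 3 additionally carries a short injected level shift.
rng = np.random.default_rng(0)
T, p = 200, 6
t = np.linspace(0.0, 8.0 * np.pi, T)
X = np.outer(5.0 * np.sin(t), np.ones(p)) + 0.1 * rng.standard_normal((T, p))
X[150:160, 3] += 3.0

scores, flagged = localize_by_row_sparsity(X, n_normal=1, rel_threshold=0.5)
print("scores:", np.round(scores, 2))
print("flagged streams:", flagged)
```

Streams whose rows survive the thresholding are the candidates for the anomaly source. The sketch ignores the network structure information that the proposed framework incorporates.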
