Paper information - Anomalous subgraph detection via Sparse Principal Component Analysis

Network datasets have become ubiquitous in many fields of study in recent years. In this paper we investigate a problem with applicability to a wide variety of domains: detecting small, anomalous subgraphs in a background graph. We characterize the anomaly in a subgraph via the well-known notion of network modularity, and we show that the resulting optimization problem is very similar to Sparse Principal Component Analysis (Sparse PCA), a recently introduced statistical technique that extends the classical PCA algorithm. The exact version of our problem formulation is a hard combinatorial optimization problem, so we consider a recently introduced semidefinite programming relaxation of the Sparse PCA problem. Results on simulated data show that the technique is very promising.
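To make the link between modularity and Sparse PCA concrete, here is a minimal Python sketch. It assumes the standard modularity matrix B = A - k k^T / (2m), where A is the adjacency matrix, k the degree vector, and m the number of edges, and it looks for a unit vector with few nonzero entries that maximizes x^T B x. It does not implement the semidefinite programming relaxation mentioned in the abstract; a simple truncated power iteration is swapped in as a stand-in heuristic. The function names, the planted-subgraph generator, and all parameter values are hypothetical choices made for illustration.

```python
import math
import random


def modularity_matrix(adj):
    """Modularity matrix B = A - k k^T / (2m) of an undirected graph."""
    n = len(adj)
    deg = [sum(row) for row in adj]   # degree vector k
    two_m = sum(deg)                  # 2m: twice the number of edges
    return [[adj[i][j] - deg[i] * deg[j] / two_m for j in range(n)]
            for i in range(n)]


def sparse_leading_vector(B, card, iters=100, seed=0):
    """Truncated power iteration (a heuristic, not the SDP relaxation) for
    max x^T B x  subject to ||x||_2 = 1 and at most `card` nonzeros."""
    n = len(B)
    # Gershgorin bound on the spectrum; adding it to the diagonal makes
    # the iterated matrix positive semidefinite.
    shift = max(sum(abs(v) for v in row) for row in B)
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in range(n)]
    for _ in range(iters):
        y = [sum(B[i][j] * x[j] for j in range(n)) + shift * x[i]
             for i in range(n)]
        # Keep only the `card` largest-magnitude entries, then renormalize.
        keep = set(sorted(range(n), key=lambda i: abs(y[i]))[-card:])
        y = [y[i] if i in keep else 0.0 for i in range(n)]
        norm = math.sqrt(sum(v * v for v in y))
        x = [v / norm for v in y]
    return x


def planted_graph(n=60, n_sub=8, p=0.05, p_sub=0.9, seed=1):
    """Sparse random background with a dense subgraph on vertices 0..n_sub-1."""
    rng = random.Random(seed)
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            # Since i < j, the test j < n_sub means both endpoints are planted.
            prob = p_sub if j < n_sub else p
            if rng.random() < prob:
                adj[i][j] = adj[j][i] = 1
    return adj


adj = planted_graph()
B = modularity_matrix(adj)
x = sparse_leading_vector(B, card=8)
support = sorted(i for i, v in enumerate(x) if v != 0.0)
print(support)  # indices of the candidate anomalous vertices
```

The support of the returned vector is the candidate anomalous subgraph. The cardinality bound `card` plays the role of the sparsity constraint in Sparse PCA, and choosing it requires a guess at the size of the subgraph being sought.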

P. Wolfe | N. Bliss | Navraj Singh | B. Miller
