Statistical evaluation of spectral methods for anomaly detection in static networks

Anomaly detection in networks has attracted considerable attention in recent years, especially with the rise of connected devices and social networks. Its applications range widely, from detecting terrorist cells in counter-terrorism efforts to identifying unexpected mutations during RNA transcription. Accordingly, numerous algorithmic techniques for anomaly detection have been introduced. To date, however, little work has been done to evaluate these algorithms from a statistical perspective. This work aims to address that gap by carrying out a statistical evaluation of a suite of popular spectral methods for anomaly detection in networks. Our investigation of the statistical properties of these algorithms reveals several critical shortcomings, which we address through methodological improvements. Further, we evaluate the performance of these algorithms on simulated networks and extend the methods from binary to count networks.
