Multiple Network Embedding for Anomaly Detection in Time Series of Graphs

This paper considers the graph signal processing problem of anomaly detection in time series of graphs. We examine two related, complementary inference tasks: detecting anomalous graphs within a time series, and detecting temporally anomalous vertices. We approach these tasks by adapting statistically principled methods for joint graph inference, specifically multiple adjacency spectral embedding (MASE) and omnibus embedding (OMNI). We demonstrate that both methods are effective for these inference tasks. Moreover, we assess their performance in terms of the underlying nature of the detectable anomalies. Our results delineate the relative strengths and limitations of the two procedures and provide insight into their use. Applied to a time series of graphs from a large-scale commercial search engine, our approaches prove applicable in practice and identify anomalous vertices whose behavior goes beyond a large change in degree.
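To make the two inference tasks concrete, the following is a minimal sketch of an omnibus-style joint embedding of two consecutive graphs, followed by a graph-level and a vertex-level change score. It is an illustration under stated assumptions, not the paper's procedure: the function names `omni_embed` and `anomaly_scores`, the choice of eigenvalues by magnitude, the use of the spectral norm and row norms as scores, and the toy data are all hypothetical choices made for this example. The abstract does not specify how scores are thresholded, so that step is omitted.

```python
import numpy as np


def omni_embed(A1, A2, d):
    """Jointly embed two symmetric adjacency matrices on the same n vertices.

    Builds the 2n x 2n omnibus matrix [[A1, avg], [avg, A2]] and returns
    two n x d latent position estimates, one per graph.
    """
    n = A1.shape[0]
    avg = (A1 + A2) / 2.0
    M = np.block([[A1, avg], [avg, A2]])
    vals, vecs = np.linalg.eigh(M)
    # Assumption: keep the d eigenpairs with largest magnitude.
    top = np.argsort(np.abs(vals))[::-1][:d]
    Z = vecs[:, top] * np.sqrt(np.abs(vals[top]))
    return Z[:n], Z[n:]


def anomaly_scores(A1, A2, d):
    """Return (graph_score, vertex_scores) for a pair of graphs.

    Assumption: the graph score is the spectral norm of the difference of
    the two embeddings; each vertex score is the Euclidean norm of that
    vertex's row difference.
    """
    X1, X2 = omni_embed(A1, A2, d)
    diff = X1 - X2
    vertex_scores = np.linalg.norm(diff, axis=1)
    graph_score = np.linalg.norm(diff, ord=2)
    return graph_score, vertex_scores


# Toy data (hypothetical): a random undirected graph on 20 vertices, and a
# copy in which every edge incident to vertex 0 is flipped.
rng = np.random.default_rng(0)
n = 20
upper = np.triu(rng.integers(0, 2, size=(n, n)), 1)
A1 = (upper + upper.T).astype(float)
A2 = A1.copy()
A2[0, 1:] = 1.0 - A2[0, 1:]
A2[1:, 0] = A2[0, 1:]

g_same, v_same = anomaly_scores(A1, A1, d=2)
g_changed, v_changed = anomaly_scores(A1, A2, d=2)
```

Because both embeddings come from one eigendecomposition, they share a coordinate system and can be compared row by row without an alignment step. In this sketch an unchanged pair of graphs yields scores near zero, while a pair with a perturbed vertex yields a larger graph score.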
