Change detection in noisy dynamic networks: a spectral embedding approach

Change detection in dynamic networks is an important problem in many areas, such as fraud detection, cyber intrusion detection and healthcare monitoring. It is challenging because the data form a time sequence of graphs, each of which is usually very large and sparse with heterogeneous vertex degrees, resulting in a complex, high-dimensional mathematical object. Spectral embedding methods provide an effective way to map a graph into a lower-dimensional latent Euclidean space that preserves the underlying structure of the network. Although change detection methods based on spectral embedding exist, they do not address the sparsity and degree heterogeneity that usually occur in noisy real-world graphs, and most of them focus on changes in the behaviour of the overall network rather than of individual entities. In this paper, we adapt previously developed techniques from spectral graph theory and propose the novel idea of applying Procrustes techniques to the embedded points of the vertices in a graph in order to detect changes in entity behaviour. Our spectral embedding approach not only addresses the sparsity and degree heterogeneity issues, but also yields an estimate of the appropriate embedding dimension. We call this method CDP (change detection using Procrustes analysis). We demonstrate the performance of CDP through extensive simulation experiments and a real-world application. CDP successfully detects various types of vertex-based changes, including (1) changes in vertex degree, (2) changes in the community membership of vertices, and (3) unusual increases or decreases in edge weights between vertices. We compare the change detection performance of CDP with that of two baseline methods that employ alternative spectral embedding approaches. Against both baselines, CDP generally shows superior performance.
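
The general idea of comparing vertex embeddings across time with Procrustes alignment can be illustrated with a minimal sketch. It is not the CDP algorithm itself: the function names, the degree regularizer tau (set to the mean degree here), the choice of a regularized normalized adjacency matrix, and the hand-picked embedding dimension d are all assumptions made for illustration, and the step that estimates the embedding dimension is omitted.

```python
import numpy as np

def spectral_embedding(A, d, tau=None):
    """Embed vertices with the d leading eigenpairs of a regularized,
    degree-normalized adjacency matrix (an illustrative choice)."""
    A = np.asarray(A, dtype=float)
    deg = A.sum(axis=1)
    if tau is None:
        tau = deg.mean()  # assumption: mean degree as the regularizer
    inv_sqrt = 1.0 / np.sqrt(deg + tau)
    L = inv_sqrt[:, None] * A * inv_sqrt[None, :]
    vals, vecs = np.linalg.eigh(L)          # L is symmetric
    idx = np.argsort(-np.abs(vals))[:d]     # d largest eigenvalues in magnitude
    return vecs[:, idx] * np.sqrt(np.abs(vals[idx]))

def procrustes_align(X, Y):
    """Return Y rotated/reflected to best match X in the least-squares sense
    (orthogonal Procrustes: minimize ||Y Q - X||_F over orthogonal Q)."""
    U, _, Vt = np.linalg.svd(Y.T @ X)
    return Y @ (U @ Vt)

def vertex_change_scores(A_prev, A_curr, d):
    """Per-vertex distance between aligned embeddings of two snapshots."""
    X = spectral_embedding(A_prev, d)
    Y = procrustes_align(X, spectral_embedding(A_curr, d))
    return np.linalg.norm(X - Y, axis=1)

def two_cliques(members_a, n):
    """Toy graph: vertices in members_a form one clique, the rest another."""
    in_a = np.zeros(n, dtype=bool)
    in_a[list(members_a)] = True
    A = np.zeros((n, n))
    A[in_a[:, None] == in_a[None, :]] = 1.0
    np.fill_diagonal(A, 0.0)
    return A

# Vertex 0 switches community between the two snapshots.
A_prev = two_cliques(range(0, 5), 10)
A_curr = two_cliques(range(1, 5), 10)
scores = vertex_change_scores(A_prev, A_curr, d=2)
print(np.round(scores, 3))
```

In this toy example the vertex that changes community receives by far the largest score, because the alignment removes the arbitrary rotation between the two embeddings and leaves only genuine displacement. A practical detector would also need a threshold or a reference distribution for the scores, which this sketch does not attempt.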
