Goodness-of-fit statistics for anomaly detection in Chung-Lu random graphs

Anomaly detection in graphs is a relevant problem in numerous applications. When determining whether an observation is anomalous with respect to the model of typical behavior, the notion of “goodness of fit” is important. This notion, however, is not well-understood in the context of graph data. In this paper, we propose three goodness-of-fit statistics for Chung-Lu random graphs, and analyze their efficacy in discriminating graphs generated by the Chung-Lu model from those with anomalous topologies. In the results of a Monte Carlo simulation, we see that the most powerful statistic for anomaly detection depends on the type of anomaly, suggesting that a hybrid statistic would be the most powerful.

[1]  M. Newman,et al.  Finding community structure in networks using the eigenvectors of matrices. , 2006, Physical review. E, Statistical, nonlinear, and soft matter physics.

[2]  Eric D. Kolaczyk,et al.  Statistical Analysis of Network Data , 2009 .

[3]  Fan Chung Graham,et al.  The Spectra of Random Graphs with Given Expected Degrees , 2004, Internet Math..

[4]  Christos Faloutsos,et al.  R-MAT: A Recursive Model for Graph Mining , 2004, SDM.

[5]  Diane J. Cook,et al.  Graph-based anomaly detection , 2003, KDD '03.

[6]  Patrick J. Wolfe,et al.  Toward signal processing theory for graphs and non-Euclidean data , 2010, 2010 IEEE International Conference on Acoustics, Speech and Signal Processing.

[7]  Patrick J. Wolfe,et al.  Subgraph Detection Using Eigenvector L1 Norms , 2010, NIPS.

[8]  F. Chung,et al.  Spectra of random graphs with given expected degrees , 2003, Proceedings of the National Academy of Sciences of the United States of America.

[9]  Eric D. Kolaczyk,et al.  Statistical Analysis of Network Data: Methods and Models , 2009 .