Context‐aided analysis of community evolution in networks

We are interested in detecting and analyzing global changes in dynamic networks (networks that evolve over time). More precisely, we consider changes in the distribution of activity within the network, in terms of density (i.e., edge existence) and intensity (i.e., edge weight). Detecting change in local properties, as well as in individual measurements or metrics, has been well studied and often reduces to traditional statistical process control. In contrast, detecting change in the larger-scale structure of a network is more challenging and less well understood. We address this problem by proposing a framework for detecting change in network structure built from three separate components: a probabilistic model for partitioning nodes by their behavior, a label-unswitching heuristic, and an approach to change detection for sequences of complex objects. We examine the performance of one instantiation of this framework, assembled mostly from previously available components. The dataset we use for these investigations is the publicly available New York City Taxi and Limousine Commission dataset, which covers all taxi trips in New York City since 2009. Using it, we investigate the evolution of an ensemble of networks under different spatiotemporal resolutions. We identify the community structure by fitting a weighted stochastic block model. We offer insights on different node ranking and clustering methods, their ability to capture the rhythm of life in the Big Apple, and their potential usefulness in highlighting changes in the underlying network structure.
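
To make the two notions of activity concrete, the following minimal sketch aggregates trips into a weighted directed network for each time window and computes a density (share of node pairs with at least one trip) and an intensity (mean trip count over existing edges). It is only an illustration under stated assumptions, not the model used in the paper: the zone names, the toy trip lists, the helper names `build_weighted_network`, `density`, and `intensity`, and the choice to count self-loops as possible edges are all hypothetical.

```python
from collections import Counter


def build_weighted_network(trips):
    """Aggregate (origin, destination) trips into edge weights (trip counts)."""
    return Counter((origin, dest) for origin, dest in trips)


def density(edges, nodes):
    """Fraction of ordered node pairs (self-loops included) with at least one trip."""
    n = len(nodes)
    return len(edges) / (n * n) if n else 0.0


def intensity(edges):
    """Mean weight (trip count) over the edges that exist."""
    return sum(edges.values()) / len(edges) if edges else 0.0


# Hypothetical zones and trips for two consecutive time windows.
zones = ["A", "B", "C"]
window_1 = [("A", "B"), ("A", "B"), ("B", "C"), ("C", "A")]
window_2 = [("A", "B")] * 6

for label, trips in [("window 1", window_1), ("window 2", window_2)]:
    edges = build_weighted_network(trips)
    print(label, round(density(edges, zones), 3), round(intensity(edges), 3))
```

In this toy example the second window has fewer distinct edges but more trips per edge, so density falls while intensity rises. This is the kind of shift in activity distribution that the two measures are meant to separate.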
