Network dependence testing via diffusion maps and distance-based correlations

Deciphering the associations between network connectivity and nodal attributes is one of the core problems in network science. The dependency structure and high dimensionality of networks pose unique challenges to traditional dependence tests, in terms of both theoretical guarantees and empirical performance. We propose an approach to test network dependence via diffusion maps and distance-based correlations. We prove that the new method yields a consistent test statistic under mild distributional assumptions on the graph structure, and demonstrate that it efficiently identifies the most informative graph embedding with respect to the diffusion time. The methodology is illustrated on both simulated and real data.
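As a rough illustration of the idea, the sketch below embeds the nodes with a diffusion map at several diffusion times, computes a squared sample distance correlation between the embedding and the nodal attributes, and calibrates the maximum over diffusion times with a permutation test. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the candidate diffusion times, the embedding dimension, the use of plain distance correlation, and the max-over-time rule are all illustrative choices that the abstract does not specify.

```python
import numpy as np


def diffusion_map(adj, t, n_components=2):
    """Embed nodes of a symmetric, non-negative adjacency matrix at diffusion time t."""
    deg = adj.sum(axis=1)
    inv_sqrt = 1.0 / np.sqrt(np.maximum(deg, 1e-12))   # guard against isolated nodes
    sym = adj * inv_sqrt[:, None] * inv_sqrt[None, :]  # D^{-1/2} A D^{-1/2}
    vals, vecs = np.linalg.eigh(sym)
    order = np.argsort(-np.abs(vals))                  # largest |eigenvalue| first
    keep = order[1:n_components + 1]                   # skip the top (trivial) eigenpair
    psi = vecs[:, keep] * inv_sqrt[:, None]            # right eigenvectors of D^{-1} A
    return psi * vals[keep] ** t                       # scale by eigenvalue^t


def pairwise_dist(x):
    """Euclidean distance matrix between rows of x (1-D input is treated as a column)."""
    x = np.asarray(x, dtype=float)
    if x.ndim == 1:
        x = x[:, None]
    diff = x[:, None, :] - x[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def double_center(d):
    return d - d.mean(axis=0, keepdims=True) - d.mean(axis=1, keepdims=True) + d.mean()


def dcorr_sq(dx, dy):
    """Squared (biased) sample distance correlation from two distance matrices."""
    a, b = double_center(dx), double_center(dy)
    denom = np.sqrt((a * a).mean() * (b * b).mean())
    return 0.0 if denom == 0 else float((a * b).mean() / denom)


def network_dependence_test(adj, attrs, times=(1, 2, 3), n_perm=200, seed=0):
    """Illustrative test: maximise the statistic over diffusion times, then permute."""
    rng = np.random.default_rng(seed)
    dy = pairwise_dist(attrs)
    dxs = [pairwise_dist(diffusion_map(adj, t)) for t in times]
    stats = [dcorr_sq(dx, dy) for dx in dxs]
    observed = max(stats)
    best_t = times[int(np.argmax(stats))]
    n = dy.shape[0]
    exceed = 0
    for _ in range(n_perm):
        p = rng.permutation(n)
        dy_p = dy[np.ix_(p, p)]                        # permute node labels of the attributes
        if max(dcorr_sq(dx, dy_p) for dx in dxs) >= observed:
            exceed += 1
    return observed, best_t, (exceed + 1) / (n_perm + 1)


# Toy data (an assumption for illustration): a two-block graph whose
# attribute is the block label plus Gaussian noise.
rng = np.random.default_rng(1)
n = 60
labels = np.repeat([0, 1], n // 2)
prob = np.where(labels[:, None] == labels[None, :], 0.5, 0.05)
upper = np.triu(rng.random((n, n)) < prob, k=1)
adj = (upper | upper.T).astype(float)
attrs = labels + 0.1 * rng.standard_normal(n)

stat, best_t, pval = network_dependence_test(adj, attrs)
print(f"statistic={stat:.3f}, diffusion time={best_t}, p-value={pval:.3f}")
```

Because the permutation step repeats the same maximisation over diffusion times, the p-value accounts for the selection of the time. Other distance-based correlations could be substituted for `dcorr_sq` without changing the rest of the sketch.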
