LOCALIZING DIFFERENTIALLY EVOLVING COVARIANCE STRUCTURES VIA SCAN STATISTICS.

Recent results in coupled or temporal graphical models offer schemes for estimating the relationship structure between features when the data come from related (but distinct) longitudinal sources. A novel application of these ideas is the analysis of group-level differences, i.e., identifying whether trends of estimated objects (e.g., covariance or precision matrices) differ across conditions (e.g., gender or disease). Often, small effect sizes make it difficult to detect the differential signal over the full set of features: for example, dependencies among only a subset of features may manifest differently across groups. In this work, we first give a parametric model for estimating trends in the space of symmetric positive definite (SPD) matrices as a function of one or more covariates. We then generalize scan statistics to graph structures in order to search over distinct subsets of features (graph partitions) whose temporal dependency structure may show statistically significant group-wise differences. We theoretically analyze the family-wise error rate (FWER) and bounds on Type I and Type II error. Evaluating on US census data, we identify groups of states with cultural and legal overlap related to baby name trends and drug usage. On a cohort of individuals with risk factors for Alzheimer's disease (but otherwise cognitively healthy), we find scientifically interesting group differences in settings where the default analysis, i.e., models estimated on the full graph, does not survive reasonable significance thresholds.
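
The idea of scanning feature subsets for group-wise differences can be illustrated with a toy sketch. It is not the model or test statistic of this work: it is a static simplification that compares one covariance matrix per group rather than trends over time, and it assumes a log-Euclidean distance between SPD matrices as the per-subset statistic and a max-statistic permutation test as one standard way to control the FWER over all scanned subsets. The names `spd_log`, `subset_stat`, and `scan`, the candidate subsets, and the simulated data are all hypothetical.

```python
import numpy as np

def spd_log(S):
    """Matrix logarithm of a symmetric positive definite matrix via eigendecomposition."""
    w, V = np.linalg.eigh(S)
    return (V * np.log(w)) @ V.T

def subset_stat(X_a, X_b, idx, ridge=1e-6):
    """Log-Euclidean distance between the two groups' covariance blocks on features idx."""
    eye = ridge * np.eye(len(idx))  # small ridge keeps the estimates strictly positive definite
    C_a = np.cov(X_a[:, idx], rowvar=False) + eye
    C_b = np.cov(X_b[:, idx], rowvar=False) + eye
    return np.linalg.norm(spd_log(C_a) - spd_log(C_b), ord="fro")

def scan(X_a, X_b, subsets, n_perm=200, seed=0):
    """Scan candidate feature subsets; return statistics and FWER-adjusted p-values."""
    rng = np.random.default_rng(seed)
    observed = np.array([subset_stat(X_a, X_b, s) for s in subsets])
    pooled = np.vstack([X_a, X_b])
    n_a = X_a.shape[0]
    max_null = np.empty(n_perm)
    for b in range(n_perm):
        # Shuffle group labels, then record the largest statistic over all subsets.
        perm = rng.permutation(pooled.shape[0])
        P_a, P_b = pooled[perm[:n_a]], pooled[perm[n_a:]]
        max_null[b] = max(subset_stat(P_a, P_b, s) for s in subsets)
    # Comparing each observed statistic with the null distribution of the maximum
    # gives p-values adjusted for scanning every subset.
    exceed = (max_null[None, :] >= observed[:, None]).sum(axis=1)
    p_adj = (1 + exceed) / (n_perm + 1)
    return observed, p_adj

# Simulated example (assumption): the groups differ only on features 0, 1, 2.
rng = np.random.default_rng(1)
X_a = rng.standard_normal((200, 6))
cov_b = np.eye(6)
cov_b[:3, :3] = 0.8
np.fill_diagonal(cov_b, 1.0)
X_b = rng.multivariate_normal(np.zeros(6), cov_b, size=200)

subsets = [[0, 1, 2], [3, 4, 5]]
observed, p_adj = scan(X_a, X_b, subsets)
for s, t, p in zip(subsets, observed, p_adj):
    print(f"features {s}: statistic = {t:.3f}, adjusted p = {p:.3f}")
```

In this simulation the group difference is confined to the first subset, so a statistic computed on the full six-feature covariance would dilute the signal with three uninformative features, whereas the scan localizes it. Replacing the per-group covariance with a fitted trend in SPD space would bring the sketch closer to the setting described above.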
