Inference on the Change Point under a High Dimensional Covariance Shift

We consider the problem of constructing asymptotically valid confidence intervals for the change point in a high-dimensional covariance shift setting. A novel estimator for the change point parameter is developed, and its asymptotic distribution under high-dimensional scaling is obtained. We establish that the proposed estimator exhibits a sharp O_p(ψ^{-2}) rate of convergence, where ψ represents the jump size between the model parameters before and after the change point. Further, the form of the asymptotic distribution is characterized under both a vanishing and a non-vanishing regime of the jump size. In the former case, it corresponds to the argmax of an asymmetric Brownian motion, while in the latter case it corresponds to the argmax of an asymmetric random walk. We then obtain the relationship between these distributions, which allows the construction of confidence intervals that adapt to the regime (vanishing vs. non-vanishing). Easy-to-implement algorithms for the proposed methodology are developed, and their performance is illustrated on synthetic and real data sets.
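
To make the non-vanishing regime more concrete, the following is a minimal Python sketch of how quantiles of the argmax of an asymmetric two-sided random walk could be approximated by Monte Carlo. It is an illustration only, not the algorithm of the paper: the Gaussian increments, the drift and standard deviation values, the truncation `horizon`, and the function names `argmax_two_sided_walk` and `simulate_quantiles` are all assumptions introduced here. In the paper, the increments of the limiting process are determined by the model parameters before and after the change point.

```python
import numpy as np


def argmax_two_sided_walk(rng, drift_left, drift_right, sd_left, sd_right, horizon):
    """Draw the argmax over {-horizon, ..., horizon} of a two-sided random walk.

    The walk equals 0 at the origin and has negative drift on both sides,
    so its maximizer stays near 0. Different drift or spread on each side
    makes the walk (and hence its argmax) asymmetric.
    Gaussian increments are an assumption made for illustration.
    """
    # Partial sums at positions -1, -2, ..., -horizon.
    left = np.cumsum(rng.normal(-drift_left, sd_left, horizon))
    # Partial sums at positions 1, 2, ..., horizon.
    right = np.cumsum(rng.normal(-drift_right, sd_right, horizon))
    # Lay the path out from -horizon to +horizon with value 0 at the origin.
    path = np.concatenate([left[::-1], [0.0], right])
    return int(np.argmax(path)) - horizon


def simulate_quantiles(alpha=0.05, n_draws=5000, seed=0, **walk_params):
    """Monte Carlo estimate of the alpha/2 and 1 - alpha/2 quantiles of the argmax."""
    rng = np.random.default_rng(seed)
    draws = np.array(
        [argmax_two_sided_walk(rng, **walk_params) for _ in range(n_draws)]
    )
    lo, hi = np.quantile(draws, [alpha / 2, 1 - alpha / 2])
    return lo, hi, draws


if __name__ == "__main__":
    # Hypothetical parameter values; weaker drift on the right gives a heavier right tail.
    lo, hi, _ = simulate_quantiles(
        drift_left=0.5, drift_right=0.25, sd_left=1.0, sd_right=1.0, horizon=200
    )
    print("approximate 95% quantile range of the argmax:", lo, hi)
```

Quantiles of this kind are what an interval around an estimated change point would be built from. The truncation at `horizon` is harmless only when the drift is strong enough that the maximizer rarely lies near the boundary, which should be checked for any parameter values used.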
