Optimal covariance change point localization in high dimensions

We study the problem of change point detection for covariance matrices in high dimensions. We assume that we observe a sequence {X_i}_{i=1,...,n} of independent and centered p-dimensional sub-Gaussian random vectors whose covariance matrices are piecewise constant. Our task is to recover with high accuracy the number and locations of the change points, which are assumed unknown. Our generic model setting allows all the model parameters to change with n, including the dimension p, the minimal spacing between consecutive change points, the magnitude of the smallest change and the maximal Orlicz-ψ_2 norm of the covariance matrices of the sample points. Without imposing any additional structural assumptions, such as low-rank matrices or sparse principal components, we set up a general framework and a benchmark result for the covariance change point detection problem. We introduce two procedures, one based on the binary segmentation algorithm (e.g. Vostrikova, 1981) and the other on its extension known as wild binary segmentation of Fryzlewicz (2014), and demonstrate that, under suitable conditions, both procedures are able to consistently estimate the number and locations of change points. Our second algorithm, called Wild Binary Segmentation through Independent Projection (WBSIP), is shown to be optimal in the sense of allowing for the minimax scaling in all the relevant parameters. Our minimax analysis reveals a phase transition effect based on the problem of change point localization. To the best of our knowledge, results of this type have not been established elsewhere in the high-dimensional change point detection literature.
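To make the binary segmentation idea concrete, the following is a minimal sketch and not the procedure analyzed in the paper. It assumes a covariance CUSUM matrix compared against a threshold in operator norm; the function names, the threshold `tau`, the minimal segment length `min_len` and the simulated data are all hypothetical choices made for illustration. It does not attempt the random intervals or the projection step of WBSIP.

```python
import numpy as np


def cov_cusum(X, s, e, t):
    """Covariance CUSUM matrix for rows s..e-1 of X, split at t (s < t < e).

    Rows are assumed centered, so X_i X_i^T is an estimate of the covariance.
    """
    left = X[s:t].T @ X[s:t]    # sum of outer products on the left block
    right = X[t:e].T @ X[t:e]   # sum of outer products on the right block
    n_l, n_r, n = t - s, e - t, e - s
    return np.sqrt(n_r / (n * n_l)) * left - np.sqrt(n_l / (n * n_r)) * right


def binary_segmentation(X, s, e, tau, min_len):
    """Recursively split rows s..e-1 while the largest CUSUM norm exceeds tau."""
    if e - s < 2 * min_len:
        return []
    candidates = range(s + min_len, e - min_len + 1)
    # Operator (spectral) norm of the CUSUM matrix at each candidate split.
    stats = [np.linalg.norm(cov_cusum(X, s, e, t), 2) for t in candidates]
    best = int(np.argmax(stats))
    if stats[best] <= tau:
        return []
    t_hat = candidates[best]
    return (binary_segmentation(X, s, t_hat, tau, min_len)
            + [t_hat]
            + binary_segmentation(X, t_hat, e, tau, min_len))


# Hypothetical data: n = 4000 centered Gaussian vectors in dimension p = 3,
# with covariance I for the first half and 9 * I for the second half.
rng = np.random.default_rng(0)
n, p = 4000, 3
X = rng.standard_normal((n, p))
X[n // 2:] *= 3.0

# tau and min_len are illustrative tuning values, not values from the paper.
estimated = binary_segmentation(X, 0, n, tau=130.0, min_len=50)
print(estimated)
```

In this sketch the threshold is set by hand for the simulated example. How it should scale with n, p and the other model parameters is what the paper's theory addresses.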
