Optimal nonparametric change point analysis

We study change point detection and localization for univariate data in fully nonparametric settings in which, at each time point, we acquire an independent and identically distributed sample from an unknown distribution, and the sequence of distributions is piecewise constant over time. The magnitude of the distributional changes at the change points is quantified using the Kolmogorov–Smirnov distance. Our framework allows all the relevant parameters, namely the minimal spacing between two consecutive change points, the minimal magnitude of the changes in the Kolmogorov–Smirnov distance, and the number of sample points collected at each time point, to change with the length of the time series. We propose a novel change point detection algorithm based on the Kolmogorov–Smirnov statistic and show that it is nearly minimax rate optimal. Our theory demonstrates a phase transition in the space of model parameters: it separates parameter combinations for which consistent localization is possible from those for which this task is statistically infeasible. We provide extensive numerical experiments to support our theory.

MSC2020 subject classifications: Primary 62G05.
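To make the role of the Kolmogorov–Smirnov distance concrete, the following is a minimal sketch of a single-change-point scan, not the algorithm proposed in the paper. It assumes one common CUSUM-style weighting, sqrt(n_left * n_right / (n_left + n_right)), applied to the sup-norm distance between the pooled empirical distribution functions on either side of a candidate split. The function names `ks_distance` and `ks_cusum_scan` are hypothetical and chosen only for illustration.

```python
import numpy as np


def ks_distance(x, y):
    """Sup-norm distance between the empirical CDFs of two 1-D samples."""
    x = np.sort(np.asarray(x, dtype=float))
    y = np.sort(np.asarray(y, dtype=float))
    # The supremum is attained at an observed point, so it suffices to
    # evaluate both right-continuous empirical CDFs on the pooled sample.
    grid = np.concatenate([x, y])
    cdf_x = np.searchsorted(x, grid, side="right") / x.size
    cdf_y = np.searchsorted(y, grid, side="right") / y.size
    return float(np.max(np.abs(cdf_x - cdf_y)))


def ks_cusum_scan(samples):
    """Scan all candidate splits of a sequence of samples.

    samples: list of 1-D arrays, one array of observations per time point.
    Returns (best_split, stats), where best_split = t means the estimated
    change occurs after the first t time points, and stats[t - 1] is the
    weighted KS statistic for that split.

    Assumption: the sqrt weighting below is an illustrative CUSUM-style
    normalization, not necessarily the exact statistic used in the paper.
    """
    n_times = len(samples)
    stats = np.zeros(n_times - 1)
    for t in range(1, n_times):
        left = np.concatenate(samples[:t])
        right = np.concatenate(samples[t:])
        weight = np.sqrt(left.size * right.size / (left.size + right.size))
        stats[t - 1] = weight * ks_distance(left, right)
    best_split = int(np.argmax(stats)) + 1
    return best_split, stats


if __name__ == "__main__":
    # Toy data: 10 time points near 0, then 10 time points near 5,
    # with 3 observations collected at each time point.
    toy = [np.array([0.0, 0.1, 0.2])] * 10 + [np.array([5.0, 5.1, 5.2])] * 10
    split, scan = ks_cusum_scan(toy)
    print("estimated change after time point", split)
```

This sketch handles only one change point and applies no threshold. Deciding whether a change is present at all, and localizing several changes, would require a calibrated threshold and a segmentation scheme on top of this scan.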
