Tail-greedy bottom-up data decompositions and fast multiple change-point detection

This article proposes a ‘tail-greedy’, bottom-up transform for one-dimensional data, which results in a nonlinear but conditionally orthonormal, multiscale decomposition of the data with respect to an adaptively chosen Unbalanced Haar wavelet basis. The ‘tail-greediness’ of the decomposition algorithm, whereby multiple greedy steps are taken in a single pass through the data, both enables fast computation and makes the algorithm applicable to the problem of consistent estimation of the number and locations of multiple change-points in data. The resulting agglomerative change-point detection method avoids the disadvantages of the classical divisive binary segmentation, and offers very good practical performance. It is implemented in the R package breakfast, available from CRAN.
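
The bottom-up, tail-greedy idea can be illustrated with a minimal Python sketch. It is not the implementation in the breakfast package, and it covers only the forward decomposition for piecewise-constant data, not the thresholding step used for change-point estimation. The function name `tguh_sketch`, the parameter `rho` (the assumed fraction of candidate merges performed per pass) and the output layout are illustrative assumptions. Each pass computes an Unbalanced Haar detail coefficient for every pair of adjacent regions and merges several non-overlapping pairs with the smallest absolute details at once.

```python
import math


def tguh_sketch(x, rho=0.01):
    """Simplified tail-greedy bottom-up Unbalanced Haar decomposition.

    Returns (details, smooth). Each entry of details is a tuple
    (d, start, split, end): regions x[start:split] and x[split:end]
    were merged, producing detail coefficient d.
    `rho` is an assumed tuning parameter, not taken from the abstract.
    """
    # Each region is stored as [start, end, sum of x over the region].
    regions = [[i, i + 1, float(v)] for i, v in enumerate(x)]
    details = []
    while len(regions) > 1:
        # Detail coefficient for every pair of adjacent regions.
        cands = []
        for k in range(len(regions) - 1):
            s1, e1, t1 = regions[k]
            s2, e2, t2 = regions[k + 1]
            n1, n2 = e1 - s1, e2 - s2
            d = math.sqrt(n1 * n2 / (n1 + n2)) * (t1 / n1 - t2 / n2)
            cands.append((abs(d), k, d))
        cands.sort()
        # Tail-greedy step: several merges per pass, smallest details first,
        # never reusing a region within the same pass.
        budget = max(1, math.ceil(rho * len(cands)))
        used, merges = set(), []
        for _, k, d in cands:
            if len(merges) == budget:
                break
            if k in used or k + 1 in used:
                continue
            used.update((k, k + 1))
            merges.append((k, d))
        # Apply merges from right to left so earlier indices stay valid.
        for k, d in sorted(merges, reverse=True):
            s1, e1, t1 = regions[k]
            _, e2, t2 = regions[k + 1]
            details.append((d, s1, e1, e2))
            regions[k:k + 2] = [[s1, e2, t1 + t2]]
    start, end, total = regions[0]
    smooth = total / math.sqrt(end - start)
    return details, smooth


# Usage: a noiseless signal with a single jump at index 50.
signal = [0.0] * 50 + [5.0] * 50
details, smooth = tguh_sketch(signal, rho=0.1)
largest = max(details, key=lambda t: abs(t[0]))
```

Because each merge is an orthonormal rotation of two smooth coefficients, the squared details plus the squared final smooth coefficient add up to the energy of the input. In this sketch, `largest[2]` is the split point associated with the largest detail, which is the kind of quantity a thresholding rule would inspect when estimating change-points.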
