Bayesian Model for Multiple Change-Points Detection in Multivariate Time Series

This paper addresses the problem of detecting change-points in time series. The proposed model, called the Bernoulli Detector, is first presented in a univariate context. Unlike parametric methods, it makes assumptions only about the nature of the change-points and does not rely on any hypothesis about the distribution of the data. It combines a robust local statistical test, based on ranks and applied to individual time segments, with a global Bayesian framework that optimizes the change-point configuration from multiple local statistics, provided as $p$-values. Control of the detection of a single change-point is proved, even for small samples. The value of such a generalizable nonparametric approach is shown on simulated data, where good performance is obtained both for Gaussian noise and in the presence of outliers, without adapting the model. The model is then extended to the multivariate case by introducing the probabilities that a change-point simultaneously affects several time series. The method thus has the advantage of detecting both unique and shared change-points for each signal. Finally, we illustrate the algorithm on real datasets from energy monitoring and genomics. The resulting segmentations are compared with state-of-the-art approaches such as the group lasso and the BARD algorithm.
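As a rough illustration of the local step described above, the sketch below computes a two-sided Wilcoxon rank-sum $p$-value between the samples on either side of each candidate position. It is not the Bernoulli Detector itself: the fixed window (`half_window`), the normal approximation without tie correction, and the function names `rank_sum_pvalue` and `local_pvalues` are assumptions made for this example, and the global Bayesian combination of the $p$-values is not shown.

```python
import math


def rank_sum_pvalue(left, right):
    """Two-sided Wilcoxon rank-sum p-value (normal approximation, no tie correction)."""
    n, m = len(left), len(right)
    pooled = sorted([(v, 0) for v in left] + [(v, 1) for v in right])
    ranks = [0.0] * (n + m)
    i = 0
    while i < n + m:
        # Find the run of tied values and give each the average rank.
        j = i
        while j + 1 < n + m and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[k] = average_rank
        i = j + 1
    # Rank sum of the left sample and its mean/variance under "no change".
    w = sum(r for r, (_, group) in zip(ranks, pooled) if group == 0)
    mean = n * (n + m + 1) / 2
    var = n * m * (n + m + 1) / 12
    z = (w - mean) / math.sqrt(var)
    return math.erfc(abs(z) / math.sqrt(2))


def local_pvalues(x, half_window):
    """Map each candidate position t to the p-value of x[t-h:t] versus x[t:t+h]."""
    h = half_window
    return {t: rank_sum_pvalue(x[t - h:t], x[t:t + h])
            for t in range(h, len(x) - h + 1)}


# Hypothetical signal: a level shift at index 50, plus one large outlier at index 20.
signal = [0.1 * (i % 5) for i in range(50)] + [3 + 0.1 * (i % 5) for i in range(50)]
signal[20] = 100.0

pvals = local_pvalues(signal, half_window=10)
best = min(pvals, key=pvals.get)  # position with the smallest local p-value
print(best, pvals[best])
```

Because the statistic depends only on ranks, the outlier at index 20 counts as a single high rank rather than dominating the comparison, which is the kind of robustness the abstract refers to. In this toy signal the smallest $p$-value falls at the true shift.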
