Bayesian detection of abnormal segments in multiple time series

We present a novel Bayesian approach to analysing multiple time series with the aim of detecting abnormal regions: regions where the properties of the data depart from some normal or baseline behaviour. We allow for the possibility that such changes are present in only a potentially small subset of the time series. We develop a general model for this problem and show how Bayesian inference can be performed accurately and efficiently using recursions that enable independent sampling from the posterior distribution. A motivating application is the detection of copy number variants (CNVs) using data from multiple individuals. Pooling information across individuals can increase the power to detect CNVs, but a specific CNV will often be present in only a small subset of the individuals. We evaluate the Bayesian method on both simulated and real CNV data, and give evidence that it is more accurate than a recently proposed method for analysing such data.
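
The trade-off between pooling and sparsity mentioned above can be illustrated with a small back-of-the-envelope calculation. The sketch below is not the paper's model: it assumes independent unit-variance Gaussian noise, a known segment, and a common mean shift in the affected series, and the function name and parameters are hypothetical. It compares the expected z-score from a single affected series, from naively averaging all series, and from an "oracle" that averages only the affected ones.

```python
import math


def segment_z_scores(shift, length, n_series, n_affected):
    """Expected z-scores for a mean shift inside a known segment.

    Illustrative assumptions (not taken from the paper): independent
    N(0, 1) noise, the same shift in every affected series, and a
    segment of `length` observations.
    """
    # One affected series: segment mean has sd 1 / sqrt(length).
    single = shift * math.sqrt(length)
    # Average over all series: the signal is diluted by unaffected series.
    pooled_all = n_affected * shift * math.sqrt(length) / math.sqrt(n_series)
    # Average over the affected series only (requires knowing which they are).
    oracle = shift * math.sqrt(n_affected * length)
    return single, pooled_all, oracle


single, pooled_all, oracle = segment_z_scores(
    shift=0.5, length=16, n_series=100, n_affected=4
)
print(single, pooled_all, oracle)
```

Under these assumptions, naive pooling over all series beats a single series only when the number of affected series exceeds the square root of the total, which is one way to see why a method should allow for a change in only a subset of the series.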
