Multiple Change-Point Detection via a Screening and Ranking Algorithm.

Let Y1, …, Yn be a sequence whose underlying mean is a step function with an unknown number of steps and unknown change points. Detecting the change points, namely the positions where the mean changes, is an important problem in fields such as engineering, economics, climatology and bioscience. The problem has attracted considerable attention in statistics, and a variety of solutions have been proposed and implemented. However, there is scant literature on the theoretical properties of those algorithms. Here, we investigate a recently developed algorithm called the Screening and Ranking Algorithm (SaRa). We characterize the theoretical properties of SaRa and show its superiority over other commonly used algorithms. In particular, we develop a false discovery rate approach to the multiple change-point problem and show a strong sure coverage property for SaRa.
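
To make the setting concrete, the following minimal Python sketch simulates a sequence with a step-function mean and runs a simple screen-then-rank scan over it. The abstract does not spell out the algorithm, so everything here is an illustrative assumption rather than the authors' exact procedure: the function names, the bandwidth `h`, the threshold, and the choice of a difference of adjacent local means as the diagnostic are all hypothetical.

```python
import numpy as np


def simulate_step_mean(n, change_points, levels, sigma, seed=0):
    """Return (y, mu) with y = mu + Gaussian noise and mu piecewise constant.

    change_points are the indices where a new level starts;
    levels has one more entry than change_points.
    """
    rng = np.random.default_rng(seed)
    bounds = [0, *change_points, n]
    mu = np.concatenate(
        [np.full(b - a, lev) for a, b, lev in zip(bounds[:-1], bounds[1:], levels)]
    )
    return mu + sigma * rng.standard_normal(n), mu


def local_diagnostic(y, h):
    """D[t] = mean(y[t:t+h]) - mean(y[t-h:t]) for h <= t <= n-h, else 0."""
    n = len(y)
    c = np.concatenate([[0.0], np.cumsum(y)])  # c[k] = sum of y[:k]
    d = np.zeros(n)
    t = np.arange(h, n - h + 1)
    d[t] = (c[t + h] - 2 * c[t] + c[t - h]) / h
    return d


def screen_and_rank(y, h, threshold):
    """Screen: keep positions where |D| is a local maximum within +/- h.
    Rank: order the survivors by |D| and drop those below the threshold.
    (Assumed illustration of a screening-and-ranking scheme.)
    """
    d = np.abs(local_diagnostic(y, h))
    n = len(y)
    candidates = []
    for t in range(h, n - h + 1):
        lo, hi = max(0, t - h), min(n, t + h + 1)
        if d[t] >= threshold and d[t] == d[lo:hi].max():
            candidates.append(t)
    return sorted(candidates, key=lambda t: -d[t])


if __name__ == "__main__":
    # Two change points at 60 and 140; noise level chosen arbitrarily.
    y, mu = simulate_step_mean(200, [60, 140], [0.0, 2.0, 0.0], sigma=0.5)
    print(screen_and_rank(y, h=10, threshold=1.0))
```

In this sketch the screening step costs O(n) for the diagnostic thanks to the cumulative sum, and the threshold plays the role that a false discovery rate rule or an information criterion would play in a principled procedure.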
