A robust aCGH data recovery framework based on half quadratic minimization

This paper presents a general half-quadratic framework for the simultaneous analysis of all array comparative genomic hybridization (aCGH) profiles in a data set. The framework accommodates different M-estimation loss functions and two underlying assumptions about the aCGH profiles of a data set: sparsity and low rank. Because it relies on M-estimation loss functions, the framework is robust to various types of noise and outliers. The resulting problem is solved by half-quadratic (HQ) minimization, and the accelerated proximal gradient (APG) method is used to speed up this procedure. Experimental results support the robustness of the proposed framework in comparison with state-of-the-art algorithms.
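
To make the idea of HQ minimization with an M-estimation loss concrete, the following is a minimal sketch and not the framework described above. It assumes the Welsch (correntropy-induced) loss in its multiplicative HQ form and estimates a single constant log-ratio level from a handful of probes, leaving out the sparsity and low-rank terms and the APG acceleration. The names `welsch_weights` and `hq_robust_level`, the kernel width `sigma`, and the synthetic probe values are all assumptions made for illustration.

```python
import numpy as np


def welsch_weights(residuals, sigma):
    """Auxiliary HQ variables (multiplicative form) for the Welsch loss."""
    return np.exp(-residuals**2 / (2.0 * sigma**2))


def hq_robust_level(values, sigma=1.0, n_iter=50, tol=1e-10):
    """Estimate one constant level by alternating the two HQ steps."""
    values = np.asarray(values, dtype=float)
    level = np.median(values)  # starting point; the median is an assumption
    for _ in range(n_iter):
        # Step 1: fix the estimate, update the auxiliary weights.
        w = welsch_weights(values - level, sigma)
        # Step 2: fix the weights, solve the weighted least-squares problem.
        new_level = np.sum(w * values) / np.sum(w)
        converged = abs(new_level - level) < tol
        level = new_level
        if converged:
            break
    return level


# Hypothetical data: 20 probes near 1.0 plus 3 gross outliers at 10.0.
rng = np.random.default_rng(0)
probes = np.concatenate([1.0 + 0.05 * rng.standard_normal(20),
                         [10.0, 10.0, 10.0]])

robust = hq_robust_level(probes)  # stays close to the inlier level
naive = probes.mean()             # pulled toward the outliers
```

In this toy setting the outliers receive weights close to zero after the first pass, so the weighted mean is driven almost entirely by the inlier probes, whereas the plain mean is not.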
