A Compressed Sensing Based Approach for Subtyping of Leukemia from Gene Expression Data

Advances in genomic techniques have increased the demand for methods that can handle high-throughput, genome-wide data effectively. Compressed sensing (CS) is an emerging approach in statistics and signal processing. According to CS theory, a signal can be uniquely reconstructed or approximated from its sparse representation, and such representations can therefore better distinguish different types of signals. However, the application of CS to genome-wide data analysis has rarely been investigated. We propose a novel CS-based approach for genomic data classification and test its performance in the subtyping of leukemia through gene expression analysis. Detecting subtypes of cancers such as leukemia according to their genetic makeup is significant because it holds promise for individualizing therapies and improving treatments. In our work, four statistical features were employed to select significant genes for classification. Using genes selected from a pool of 7,129, the proposed CS method achieved a classification accuracy of 97.4% under cross-validation and 94.3% on an independent data set. The method's robustness to noise was also tested, with good performance. This work therefore demonstrates that the CS method can effectively detect subtypes of leukemia, suggesting improved accuracy in the diagnosis of leukemia.
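
The abstract does not spell out the classifier, so the following is only a minimal sketch of one common way to classify with sparse representations, not the authors' implementation. It assumes a test expression profile is approximated as a sparse combination of training profiles and is assigned to the class whose training samples leave the smallest reconstruction residual. The greedy solver (orthogonal matching pursuit), the function names `omp` and `src_classify`, the labels `subtype_A` and `subtype_B`, and the synthetic data are all illustrative assumptions.

```python
import numpy as np

def omp(D, y, n_nonzero):
    """Greedy orthogonal matching pursuit: approximate y using at most
    n_nonzero columns of D (assumes 1 <= n_nonzero <= D.shape[1])."""
    selected = np.zeros(D.shape[1], dtype=bool)
    residual = y.copy()
    for _ in range(n_nonzero):
        # Pick the not-yet-selected column most correlated with the residual.
        correlations = np.abs(D.T @ residual)
        correlations[selected] = -np.inf
        selected[np.argmax(correlations)] = True
        # Refit all selected columns jointly by least squares.
        sol, *_ = np.linalg.lstsq(D[:, selected], y, rcond=None)
        residual = y - D[:, selected] @ sol
    coef = np.zeros(D.shape[1])
    coef[selected] = sol
    return coef

def src_classify(D, labels, y, n_nonzero=5):
    """Assign y to the class whose training columns best reconstruct it."""
    D = D / np.linalg.norm(D, axis=0, keepdims=True)  # unit-norm columns
    coef = omp(D, y, n_nonzero)
    classes = np.unique(labels)
    residuals = [
        np.linalg.norm(y - D[:, labels == c] @ coef[labels == c])
        for c in classes
    ]
    return classes[int(np.argmin(residuals))]

# Synthetic stand-in for expression data: 50 hypothetical selected genes,
# 20 training samples per class, each a noisy copy of a class pattern.
rng = np.random.default_rng(0)
n_genes, n_per_class = 50, 20
pattern_a = rng.normal(size=n_genes)
pattern_b = rng.normal(size=n_genes)
train = np.column_stack(
    [pattern_a + 0.3 * rng.normal(size=n_genes) for _ in range(n_per_class)]
    + [pattern_b + 0.3 * rng.normal(size=n_genes) for _ in range(n_per_class)]
)
labels = np.array(["subtype_A"] * n_per_class + ["subtype_B"] * n_per_class)

test_sample = pattern_b + 0.3 * rng.normal(size=n_genes)
predicted = src_classify(train, labels, test_sample)
print(predicted)
```

In this sketch the sparsity level `n_nonzero` plays the role of a tuning parameter; in practice it would be chosen by cross-validation, and an L1-regularized solver could replace the greedy step.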
