Cancer Classification Analysis for Microarray Gene Expression Data by Integrating Wavelet Transform and Visual Analysis

Cancer classification using microarray gene expression data has attracted considerable interest because it enables computational cancer diagnosis. However, researchers often have difficulty analyzing the data well enough to reach an accurate diagnosis because of the size and noise of microarray gene expression data. It is therefore important to apply a feature extraction procedure to improve diagnostic performance. In this study, we propose an approach that integrates wavelet-based feature extraction and visual analysis for cancer classification. Features are extracted with a wavelet transform and validated with a statistical test so that only statistically significant features are retained. Visual analysis is also conducted to inspect both the distribution of the features and the patterns in the cancer data. Using cancer datasets, we measure the performance of three machine learning (ML) algorithms for cancer classification to show the effectiveness of the approach. The performance evaluation shows that our approach can classify cancers accurately.
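The feature extraction step described above can be illustrated with a minimal sketch. The abstract does not name the wavelet family, the decomposition level, the statistical test, or the significance threshold, so everything below is an assumption chosen for illustration: a Haar wavelet, two decomposition levels, Welch's t statistic, and a cutoff of |t| > 2. The data are synthetic, and the helper names (`haar_dwt`, `welch_t`, `select_features`) are hypothetical.

```python
import math
import random
import statistics


def haar_dwt(signal, levels=1):
    """Return the Haar approximation coefficients after `levels` decompositions."""
    approx = list(signal)
    for _ in range(levels):
        if len(approx) % 2:  # pad odd-length input by repeating the last value
            approx.append(approx[-1])
        approx = [
            (approx[i] + approx[i + 1]) / math.sqrt(2)
            for i in range(0, len(approx), 2)
        ]
    return approx


def welch_t(a, b):
    """Welch's t statistic for two samples (assumed test, not the paper's)."""
    se = math.sqrt(
        statistics.variance(a) / len(a) + statistics.variance(b) / len(b)
    )
    return (statistics.mean(a) - statistics.mean(b)) / se if se > 0 else 0.0


def select_features(features, labels, threshold=2.0):
    """Keep indices of coefficients whose |t| between classes exceeds threshold."""
    selected = []
    for j in range(len(features[0])):
        class0 = [row[j] for row, y in zip(features, labels) if y == 0]
        class1 = [row[j] for row, y in zip(features, labels) if y == 1]
        if abs(welch_t(class0, class1)) > threshold:
            selected.append(j)
    return selected


# Synthetic stand-in for microarray data: 20 samples x 16 "genes".
# Class 1 has elevated expression in the first four genes.
random.seed(0)
labels = [0] * 10 + [1] * 10
samples = []
for y in labels:
    row = [random.gauss(0.0, 1.0) for _ in range(16)]
    if y == 1:
        for g in range(4):
            row[g] += 3.0
    samples.append(row)

# Two Haar levels compress 16 expression values into 4 coefficients per sample.
features = [haar_dwt(row, levels=2) for row in samples]
selected = select_features(features, labels)
print("coefficients per sample:", len(features[0]))
print("selected coefficient indices:", selected)
```

In this sketch the approximation coefficients act as a compressed, smoothed representation of each expression profile, and the test discards coefficients that do not differ between classes. The retained coefficients would then be passed to a classifier of choice.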
