Sparse Manifold Clustering and Embedding to discriminate gene expression profiles of glioblastoma and meningioma tumors

The Sparse Manifold Clustering and Embedding (SMCE) algorithm was recently proposed for simultaneous clustering and dimensionality reduction of data lying on nonlinear manifolds, using sparse representation techniques. In this work, SMCE is applied to the differential discrimination of glioblastoma and meningioma tumors by means of their gene expression profiles. Our purpose was to evaluate the robustness of this nonlinear manifold method for classifying gene expression profiles, which are characterized by high-dimensional representations and by the low discrimination power of most genes. To this end, we used SMCE to reduce the dimensionality of a preprocessed dataset of 35 single-labeling cDNA microarrays with 11500 original clones. Afterwards, supervised and unsupervised methodologies were applied to obtain the classification model: the former was based on linear discriminant analysis, the latter on clustering of the SMCE embedding data. With both approaches, all samples (100%) were correctly classified, and the results of all repetitions but one formed a compatible cluster of predictive labels. Finally, the low-dimensional embedding of the dataset extracted by SMCE revealed large discrimination margins between the two classes.
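The workflow described above (nonlinear embedding, then a supervised and an unsupervised classifier on the embedded data) can be sketched as follows. This is only an illustrative outline under stated assumptions, not the authors' implementation: SMCE is not available in scikit-learn, so `SpectralEmbedding` is swapped in as a generic nonlinear embedding, the expression matrix is synthetic random data, and the class sizes, number of genes, number of informative genes, and embedding dimension are all hypothetical.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.manifold import SpectralEmbedding
from sklearn.model_selection import StratifiedKFold, cross_val_score

rng = np.random.default_rng(0)

# Synthetic stand-in for a preprocessed expression matrix (samples x genes).
# 35 samples as in the abstract; everything else here is a hypothetical choice.
class_sizes = (18, 17)      # assumed split between the two tumor types
n_genes = 500               # far fewer than the 11500 clones, for speed
n_informative = 20          # most genes carry no class information
labels = np.repeat([0, 1], class_sizes)
X = rng.normal(size=(labels.size, n_genes))
X[labels == 1, :n_informative] += 3.0

# Step 1: nonlinear dimensionality reduction.
# SpectralEmbedding is a stand-in for SMCE, NOT the SMCE algorithm itself.
# Like SMCE, it embeds all samples jointly (no out-of-sample transform).
embedding = SpectralEmbedding(
    n_components=2, affinity="rbf", random_state=0
).fit_transform(X)

# Step 2a (supervised): linear discriminant analysis on the embedded data,
# scored with stratified 5-fold cross-validation.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
cv_accuracy = cross_val_score(
    LinearDiscriminantAnalysis(), embedding, labels, cv=cv
).mean()

# Step 2b (unsupervised): two-cluster k-means on the same embedding.
clusters = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(embedding)

# Cluster ids are arbitrary, so score agreement up to label swapping.
match = np.mean(clusters == labels)
agreement = max(match, 1.0 - match)

print(f"LDA cross-validated accuracy: {cv_accuracy:.2f}")
print(f"Cluster/label agreement:      {agreement:.2f}")
```

Because the embedding is computed on all samples before cross-validation, the supervised score in this sketch is transductive; the printed numbers depend on the synthetic data and say nothing about the results reported in the abstract.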
