Identification of glioma cancer-alerted gene markers based on a diagnostic outcome correlation analysis preferential approach

Identifying glioma cancer-alerted genetic markers through analysis of microarray data allows us to detect tumours at the genome-wide level. To this end, we propose to identify glioma gene markers primarily by their correlation with glioma diagnostic outcomes, rather than merely by classification quality or differential expression levels: what matters most is not the classification or the expression level per se, but the selection of biologically relevant biomarkers. Using singular value decomposition, the microarray data are decomposed and the eigenvectors corresponding to the biological effect of the diagnostic outcomes are identified. The genes that play important roles in determining this biological effect are then detected, so that genes are identified essentially by the strength of their association with diagnostic outcomes. Monte Carlo simulations are then used to fine-tune the selected gene set with respect to classification accuracy. Experiments show that the proposed method achieves better classification accuracy and that its performance does not depend on the data set. Graph-based statistical analysis shows that the selected genes are closely related to glioma diagnostic outcomes. Further study of biological databases and the literature confirms that the identified genes are biologically relevant.
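The decomposition step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `select_outcome_genes`, the row-centring, the use of Pearson correlation to pick the outcome-related component, and the synthetic data are all assumptions made for illustration, and the Monte Carlo fine-tuning step is omitted.

```python
import numpy as np

def select_outcome_genes(expr, labels, n_genes=10):
    """Rank genes by their loading on the SVD component that tracks the outcome.

    expr   : genes x samples expression matrix (hypothetical input layout)
    labels : one numeric diagnostic label per sample (e.g. 0/1)
    """
    expr = np.asarray(expr, dtype=float)
    labels = np.asarray(labels, dtype=float)

    # Centre each gene so the decomposition captures variation across samples.
    centered = expr - expr.mean(axis=1, keepdims=True)

    # Columns of u are gene loadings; rows of vt are sample-space patterns.
    u, s, vt = np.linalg.svd(centered, full_matrices=False)

    # Ignore numerically null components, whose patterns are meaningless.
    keep = s > 1e-10 * s[0]

    # Assumption: the outcome-related component is the one whose sample
    # pattern has the highest absolute Pearson correlation with the labels.
    corr = np.array([abs(np.corrcoef(v, labels)[0, 1]) for v in vt[keep]])
    k = int(np.argmax(corr))

    # Genes with the largest absolute loading on that component come first.
    scores = np.abs(u[:, k]) * s[k]
    top = np.argsort(scores)[::-1][:n_genes]
    return top, k, corr[k]

# Synthetic example: 50 genes, 20 samples, genes 0-4 shifted in class 1.
rng = np.random.default_rng(0)
labels = np.array([0] * 10 + [1] * 10)
expr = rng.normal(size=(50, 20))
expr[:5, labels == 1] += 6.0

top, k, r = select_outcome_genes(expr, labels, n_genes=5)
print("selected genes:", sorted(top.tolist()))
print("component index:", k, "abs. correlation with outcome:", round(float(r), 3))
```

In this sketch the genes are ranked by association with the outcome pattern rather than by a classifier's accuracy, which mirrors the priority stated in the abstract; a subsequent resampling step could then prune the ranked list against classification accuracy.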
