Mining Gene Expression Data of Multiple Sclerosis

Objectives: Microarrays produce large amounts of gene expression data that carry a wide range of biological information. The challenge is to detect a panel of discriminative genes associated with a disease. This study proposes a robust classification model for gene selection from gene expression data and applies it to identify disease-related genes, using multiple sclerosis as an example.

Materials and methods: Gene expression profiles based on the transcriptome of peripheral blood mononuclear cells were analyzed for a total of 44 samples, from 26 multiple sclerosis patients and 18 individuals with other neurological diseases (controls). Three feature selection algorithms, Support Vector Machine based on Recursive Feature Elimination, Receiver Operating Characteristic curve analysis, and Boruta, were applied jointly to select candidate genes associated with multiple sclerosis. Multiple classification models then assigned samples to the two groups based on the identified genes. Model performance was evaluated using cross-validation, and an optimal classifier for gene selection was determined.

Results: An overlapping feature set of 8 genes was identified; these genes were differentially expressed between the two phenotype groups. The genes were significantly associated with the apoptosis and cytokine-cytokine receptor interaction pathways. TNFSF10 was significantly associated with multiple sclerosis. A Support Vector Machine model built on the selected genes achieved an accuracy of ∼86%. This binary classification model also outperformed the other models in terms of sensitivity, specificity, and F1 score.

Conclusions: The combined analytical framework, which integrates feature ranking algorithms with a Support Vector Machine model, could be used to select genes for other diseases.
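The workflow summarized above (three feature selectors, an overlapping gene set, then a cross-validated Support Vector Machine) can be sketched roughly as follows. This is a minimal illustration in Python with scikit-learn, not the authors' implementation: the function names, the top-k cutoff, the 5-fold split, and the synthetic data are all assumptions, and `shadow_rf_top` is only a single-round, shadow-feature simplification standing in for the full Boruta algorithm.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFE
from sklearn.metrics import accuracy_score, f1_score, recall_score, roc_auc_score
from sklearn.model_selection import StratifiedKFold, cross_val_predict
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def svm_rfe_top(X, y, k):
    """SVM-RFE: repeatedly drop the genes with the smallest linear-SVM weights."""
    rfe = RFE(SVC(kernel="linear"), n_features_to_select=k, step=0.1)
    rfe.fit(StandardScaler().fit_transform(X), y)
    return {int(j) for j in np.flatnonzero(rfe.support_)}


def roc_top(X, y, k):
    """Rank each gene by how far its single-gene AUC is from chance (0.5)."""
    auc = np.array([roc_auc_score(y, X[:, j]) for j in range(X.shape[1])])
    order = np.argsort(np.abs(auc - 0.5))[::-1]
    return {int(j) for j in order[:k]}


def shadow_rf_top(X, y, k, seed=0):
    """Simplified stand-in for Boruta (assumption): keep genes whose random
    forest importance beats the best permuted 'shadow' copy, in one round."""
    rng = np.random.default_rng(seed)
    shadows = np.column_stack([rng.permutation(X[:, j]) for j in range(X.shape[1])])
    rf = RandomForestClassifier(n_estimators=200, random_state=seed)
    rf.fit(np.hstack([X, shadows]), y)
    real = rf.feature_importances_[: X.shape[1]]
    shadow_max = rf.feature_importances_[X.shape[1]:].max()
    keep = np.flatnonzero(real > shadow_max)
    best = keep[np.argsort(real[keep])[::-1][:k]]
    return {int(j) for j in best}


def evaluate_svm(X, y, genes, seed=0):
    """Cross-validated accuracy, sensitivity, specificity and F1 of a linear SVM."""
    cols = sorted(genes)
    model = make_pipeline(StandardScaler(), SVC(kernel="linear"))
    cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=seed)
    pred = cross_val_predict(model, X[:, cols], y, cv=cv)
    return {
        "accuracy": accuracy_score(y, pred),
        "sensitivity": recall_score(y, pred, pos_label=1),
        "specificity": recall_score(y, pred, pos_label=0),
        "f1": f1_score(y, pred, pos_label=1),
    }


if __name__ == "__main__":
    # Synthetic stand-in for an expression matrix: 44 samples x 200 "genes".
    X, y = make_classification(n_samples=44, n_features=200, n_informative=8,
                               n_redundant=0, weights=[18 / 44, 26 / 44],
                               class_sep=2.0, random_state=0)
    k = 30  # hypothetical per-method cutoff
    consensus = svm_rfe_top(X, y, k) & roc_top(X, y, k) & shadow_rf_top(X, y, k)
    print("overlapping genes:", sorted(consensus))
    if consensus:
        print(evaluate_svm(X, y, consensus))
```

Note that this sketch selects genes on the full data set before cross-validating the classifier, which tends to give optimistic performance estimates; nesting the selection step inside each training fold avoids that bias.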
