Enrichment analysis applied to disease prognosis

Abstract

Enrichment analysis is well established in the field of transcriptomics, where it is used to identify relevant biological features that characterize a set of genes obtained in an experiment.

This article proposes the application of enrichment analysis as a first step in a disease prognosis methodology, in particular for diseases with a strong genetic component. The objective of this analysis is to identify clinical and biological features that characterize groups of patients with a common disease, and that can be used to distinguish between groups of patients associated with disease-related events. Data mining methodologies can then be used to exploit those features and assist medical doctors in evaluating patients with respect to their predisposition for a specific event.

In this work, hypertrophic cardiomyopathy (HCM) is used as a case study, as a first test to assess the feasibility of applying enrichment analysis to disease prognosis. To perform this assessment, two groups of patients were considered: patients who have suffered a sudden cardiac death episode and patients who have not.

The results presented were obtained with genetic data and the Gene Ontology, in two enrichment analyses: an enrichment profiling, which aims to characterize a group of patients (e.g. those who suffered a disease-related event) based on their mutations; and a differential enrichment, which aims to identify features that differentiate a sub-group of patients from all the patients with the disease. These analyses are an adaptation of the standard enrichment analysis, since multiple sets of genes are considered, one for each patient.

The preliminary results are promising, as the sets of terms obtained reflect the current knowledge about the gene functions commonly altered in HCM patients, thus allowing their characterization. Nevertheless, some factors need to be taken into consideration before the full potential of enrichment analysis in the prognosis methodology can be evaluated. One such factor is the need to test the enrichment analysis with clinical data, in addition to genetic data, since both types of data are expected to be necessary for prognosis purposes.
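The differential enrichment described above can be illustrated with a minimal sketch. It assumes a one-sided hypergeometric test, a common choice in standard enrichment analysis; the abstract does not state which statistical test the authors used. The function names, patient identifiers and term labels below are hypothetical placeholders, not real Gene Ontology identifiers or patient data.

```python
from math import comb


def hypergeom_upper_tail(k, N, K, n):
    """P(X >= k) when n patients are drawn from N, of whom K carry the term."""
    upper = min(K, n)
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / comb(N, n)


def differential_enrichment(patient_terms, subgroup):
    """Score each term for over-representation in a patient sub-group.

    patient_terms: dict mapping a patient id to the set of ontology terms
                   annotated to that patient's mutated genes (assumed input).
    subgroup:      set of patient ids, e.g. those who suffered the event.
    Returns a dict mapping each term to its p-value (not corrected for
    multiple testing).
    """
    N = len(patient_terms)   # all patients with the disease
    n = len(subgroup)        # patients in the sub-group
    all_terms = set().union(*patient_terms.values())
    p_values = {}
    for term in all_terms:
        # K: patients carrying the term overall; k: carriers in the sub-group
        K = sum(term in terms for terms in patient_terms.values())
        k = sum(term in patient_terms[p] for p in subgroup)
        p_values[term] = hypergeom_upper_tail(k, N, K, n)
    return p_values


# Toy data: six hypothetical patients, three of them in the event sub-group.
patient_terms = {
    "p1": {"term_A", "term_B"},
    "p2": {"term_A", "term_B"},
    "p3": {"term_A", "term_B"},
    "p4": {"term_B"},
    "p5": {"term_B"},
    "p6": {"term_B"},
}
event_group = {"p1", "p2", "p3"}
p_values = differential_enrichment(patient_terms, event_group)
```

In this toy input, `term_B` is shared by every patient and therefore cannot differentiate the sub-group, while `term_A` occurs only in the sub-group and receives a lower p-value. A real analysis would also need a multiple-testing correction and a decision on how ontology ancestors of each annotated term are counted.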
