Mapping proteins to disease terminologies: from UniProt to MeSH

Background: Although the UniProt KnowledgeBase is not a medically oriented database, it contains information on more than 2,000 human proteins involved in pathologies. However, these annotations are not standardized, which impairs interoperability between biological and clinical resources. To make these data easily accessible to clinical researchers, we have developed a procedure to link diseases described in UniProtKB/Swiss-Prot entries to the MeSH disease terminology.

Results: We mapped disease names extracted either from the UniProtKB/Swiss-Prot entry comment lines or from the corresponding OMIM entry to MeSH. Different methods were assessed on a benchmark set of 200 disease names manually mapped to MeSH terms. The retained procedure achieved a precision of 86% and a recall of 64%. Using the same procedure, more than 3,000 disease names in Swiss-Prot were mapped to MeSH with comparable efficiency.

Conclusions: This study is a first attempt to link proteins in UniProtKB to medical resources. The indexing we provide will help clinicians and researchers navigate efficiently from diseases to genes and from genes to diseases. The mapping is available at: http://research.isb-sib.ch/unimed.
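To make the evaluation described in the Results concrete, the following minimal Python sketch maps disease names to a terminology and scores the result against a manually mapped benchmark. It is an illustration only, not the procedure retained in the study: the matching rule (exact match after lowercasing, stripping punctuation, and ignoring word order) is an assumed stand-in, the helper names (`normalize`, `build_index`, `map_disease`, `precision_recall`) are hypothetical, and the tiny term list and benchmark are toy data. Precision is the fraction of proposed mappings that are correct; recall is the fraction of benchmark names that receive a correct mapping.

```python
import re


def normalize(name):
    """Lowercase, drop punctuation, and sort tokens so word order is ignored."""
    tokens = re.findall(r"[a-z0-9]+", name.lower())
    return " ".join(sorted(tokens))


def build_index(terms):
    """Map every normalized heading or synonym to its heading."""
    index = {}
    for heading, synonyms in terms.items():
        for label in [heading, *synonyms]:
            index[normalize(label)] = heading
    return index


def map_disease(name, index):
    """Return the matching heading, or None when no label matches."""
    return index.get(normalize(name))


def precision_recall(predicted, gold):
    """predicted: name -> heading or None; gold: name -> correct heading."""
    proposed = {n: h for n, h in predicted.items() if h is not None}
    correct = sum(1 for n, h in proposed.items() if gold.get(n) == h)
    precision = correct / len(proposed) if proposed else 0.0
    recall = correct / len(gold) if gold else 0.0
    return precision, recall


# Toy terminology (assumption): heading -> list of synonyms.
terms = {
    "Cystic Fibrosis": ["Mucoviscidosis"],
    "Muscular Dystrophy, Duchenne": [],
    "Marfan Syndrome": [],
}

# Toy benchmark (assumption): disease name -> manually assigned heading.
gold = {
    "cystic fibrosis": "Cystic Fibrosis",
    "Duchenne muscular dystrophy": "Muscular Dystrophy, Duchenne",
    "Marfan syndrome type 1": "Marfan Syndrome",
}

index = build_index(terms)
predicted = {name: map_disease(name, index) for name in gold}
precision, recall = precision_recall(predicted, gold)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

In this toy run the third name finds no match, so recall drops while precision is unaffected, which mirrors the general trade-off behind a result where precision is higher than recall.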
