An ontological modeling approach to cerebrovascular disease studies: The NEUROWEB case

The NEUROWEB project supports association studies in cerebrovascular research, understood here as the search for statistical correlations between a feature (e.g., a genotype) and a phenotype. In this project the phenotype refers to the patient's pathological state, and is therefore formulated from the clinical data collected during diagnostic activity. To strengthen the statistical robustness of the association inquiries, the project involves four clinical institutions in the European Union, each of which provides its own proprietary repository of patient data. Although all sites comply with common diagnostic guidelines, they also adopt site-specific protocols, so the repository contents are partially discrepant. To exploit NEUROWEB data effectively for association studies, it is therefore necessary to provide a framework for phenotype formulation that is grounded in the clinical repository content and that explicitly addresses the inherent integration problem. To that end, we developed an ontological model for cerebrovascular phenotypes, the NEUROWEB Reference Ontology, composed of three layers. The top layer (Top Phenotypes) is an expert-based taxonomy of cerebrovascular diseases. The middle layer decomposes the Top Phenotypes into more elementary phenotypes (Low Phenotypes) and general-use medical concepts, such as anatomical parts and topological concepts. The bottom layer (Core Data Set, or CDS) comprises the clinical indicators required for the diagnosis of cerebrovascular disorders. Low Phenotypes are connected to the bottom layer by specifying which combination of CDS values is required for each of them to hold. Finally, CDS elements are mapped to the local repositories of clinical data. The NEUROWEB system exploits the Reference Ontology to query the different repositories and to retrieve patients characterized by a common phenotype.
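
As a rough illustration of how the three layers and the repository mappings could fit together, the following Python sketch resolves a Top Phenotype into Low Phenotypes, evaluates each Low Phenotype as a condition over CDS values, and translates site-specific fields into CDS elements before querying. It is not the NEUROWEB implementation: every name in it (the sites, the local field names, the CDS indicators, the phenotype labels, and the 50% threshold) is a hypothetical placeholder chosen for the example.

```python
from typing import Callable, Dict, List, Tuple

Record = Dict[str, object]  # one patient described at the CDS level

# Bottom layer mapping (hypothetical): CDS element -> (local field, converter).
# Each site stores the same indicator under a different name and encoding.
SITE_MAPPINGS: Dict[str, Dict[str, Tuple[str, Callable]]] = {
    "site_a": {
        "ischemic_lesion": ("ct_lesion", lambda v: v == "Y"),
        "stenosis_percent": ("carotid_sten", float),
    },
    "site_b": {
        "ischemic_lesion": ("infarct_flag", lambda v: v == 1),
        "stenosis_percent": ("stenosis_pct", float),
    },
}

# Middle layer (hypothetical): a Low Phenotype holds when a given
# combination of CDS values is present.
LOW_PHENOTYPES: Dict[str, Callable[[Record], bool]] = {
    "ImagingConfirmedIschemia": lambda r: bool(r["ischemic_lesion"]),
    "SignificantStenosis": lambda r: r["stenosis_percent"] >= 50.0,  # illustrative cut-off
}

# Top layer (hypothetical): a Top Phenotype is decomposed into Low Phenotypes.
TOP_PHENOTYPES: Dict[str, List[str]] = {
    "LargeArteryIschemicStroke": ["ImagingConfirmedIschemia", "SignificantStenosis"],
}


def to_cds(site: str, local_row: Dict[str, object]) -> Record:
    """Translate one local repository row into CDS elements."""
    mapping = SITE_MAPPINGS[site]
    return {cds: convert(local_row[field]) for cds, (field, convert) in mapping.items()}


def has_phenotype(top: str, record: Record) -> bool:
    """A Top Phenotype holds when all of its Low Phenotypes hold."""
    return all(LOW_PHENOTYPES[low](record) for low in TOP_PHENOTYPES[top])


def query(top: str, repositories: Dict[str, List[Dict[str, object]]]) -> List[Tuple[str, str]]:
    """Return (site, patient_id) pairs for patients sharing the phenotype."""
    hits = []
    for site, rows in repositories.items():
        for row in rows:
            if has_phenotype(top, to_cds(site, row)):
                hits.append((site, row["patient_id"]))
    return hits
```

Keeping the site mappings separate from the phenotype definitions means that, in this sketch, a phenotype is written once against the CDS and each additional repository only needs its own mapping table.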
