Populating Biomedical Ontologies from Natural Language Texts

Ontology population is a knowledge acquisition activity that relies on (semi-)automatic methods to transform unstructured, semi-structured and structured data sources into instance data. This work presents a semantic-role-based process for ontology population that provides a suitable framework for textual knowledge acquisition in the biological domain. In particular, our approach enriches a given ontology by adding instances gathered from biological natural language texts. The system's modular architecture provides greater versatility than current approaches in this domain, because the ontology population process does not depend directly on the linguistic rules developed from the corpus.
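As a rough illustration of the decoupling described above, the following minimal sketch separates the linguistic stage (which is assumed to emit semantic-role frames) from the population stage (which only consults a role-to-ontology mapping). It is not the system described in this work: the `Frame` and `Ontology` classes, the `ROLE_MAPPING` table, the PropBank-style `ARG0`/`ARG1` labels, and the sample frames are all hypothetical names and data chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Frame:
    """One predicate with its semantic-role arguments (assumed output of a linguistic stage)."""
    predicate: str
    roles: dict  # role label -> text span, e.g. {"ARG0": "p53"}


@dataclass
class Ontology:
    """Toy ontology: class name -> set of instance names, plus relation triples."""
    instances: dict = field(default_factory=dict)
    relations: set = field(default_factory=set)

    def add_instance(self, cls, name):
        self.instances.setdefault(cls, set()).add(name)


# Hypothetical mapping: predicate -> (relation name, class of ARG0, class of ARG1).
# The population step reads only this table, never the linguistic rules themselves.
ROLE_MAPPING = {
    "activate": ("activates", "Protein", "Gene"),
}


def populate(ontology, frames, mapping=ROLE_MAPPING):
    """Add instances and relations for every frame whose predicate is mapped."""
    for frame in frames:
        rule = mapping.get(frame.predicate)
        has_roles = all(r in frame.roles for r in ("ARG0", "ARG1"))
        if rule is None or not has_roles:
            continue  # unmapped predicate or incomplete frame: skip it
        relation, agent_cls, patient_cls = rule
        agent, patient = frame.roles["ARG0"], frame.roles["ARG1"]
        ontology.add_instance(agent_cls, agent)
        ontology.add_instance(patient_cls, patient)
        ontology.relations.add((agent, relation, patient))
    return ontology


# Illustrative frames, as a semantic-role labeller might produce them.
frames = [
    Frame("activate", {"ARG0": "p53", "ARG1": "CDKN1A"}),
    Frame("suggest", {"ARG0": "these results"}),
]
onto = populate(Ontology(), frames)
```

In this sketch, changing how frames are extracted from text would leave `populate` untouched, provided the frames keep the same shape.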
