BioOntoVerb: A top level ontology based framework to populate biomedical ontologies from texts

The Semantic Web can be conceived as an extension of the current Web where information is given well-defined meaning. In this scenario, ontologies are crucial because they provide meaning and facilitate the search for content and information. Ontology population is a knowledge acquisition activity that transforms data sources into instance data. Instantiating ontologies with new knowledge is an important step towards providing valuable ontology-based services. In this paper, we present a methodology for ontology population. To this end, top-level ontologies that define the basic semantic relations in biomedical domains are mapped onto semantic role labelling resources, where each semantic role defines the role of a verbal argument in the event expressed by the verb. The modular architecture of our work makes the system highly versatile: the resources have been developed separately and can be easily adapted to most biomedical domain ontologies.
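The mapping idea described above can be illustrated with a minimal sketch. It is not the BioOntoVerb implementation: the relation names (`has_part`, `causes`), the role labels (`Arg0`, `Arg1`), the verb table, and the input format are all assumptions made for illustration, and the sentences are assumed to have already been processed by some semantic role labeller.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RelationMapping:
    """Links one ontology relation to the semantic roles that fill it."""
    relation: str      # hypothetical top-level ontology relation name
    subject_role: str  # semantic role that supplies the relation's subject
    object_role: str   # semantic role that supplies the relation's object


# Hypothetical verb-to-relation table; a real resource would be far larger.
VERB_MAPPINGS = {
    "contain": RelationMapping("has_part", "Arg0", "Arg1"),
    "cause": RelationMapping("causes", "Arg0", "Arg1"),
}


def populate(frames):
    """Turn role-labelled verb frames into (subject, relation, object) triples.

    Each frame is assumed to be a dict with a lemmatised "verb" and a
    "roles" dict from role label to argument text. Frames whose verb is
    unmapped, or that lack a required role, are skipped.
    """
    triples = []
    for frame in frames:
        mapping = VERB_MAPPINGS.get(frame["verb"])
        if mapping is None:
            continue
        roles = frame["roles"]
        if mapping.subject_role in roles and mapping.object_role in roles:
            triples.append(
                (roles[mapping.subject_role], mapping.relation, roles[mapping.object_role])
            )
    return triples


# Illustrative input: already role-labelled frames (invented examples).
frames = [
    {"verb": "contain", "roles": {"Arg0": "nucleus", "Arg1": "chromatin"}},
    {"verb": "cause", "roles": {"Arg0": "mutation"}},       # missing Arg1, skipped
    {"verb": "bind", "roles": {"Arg0": "p53", "Arg1": "DNA"}},  # unmapped verb, skipped
]
instances = populate(frames)
```

Keeping the verb table separate from the population function mirrors the modularity the abstract emphasises: under these assumptions, adapting the sketch to another domain ontology would mean replacing the table rather than the code.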
