Overview of the gene regulation network and the bacteria biotope tasks in BioNLP'13 shared task

Background: We present the two Bacteria Track tasks of the BioNLP 2013 Shared Task (ST): Gene Regulation Network (GRN) and Bacteria Biotope (BB). These tasks were previously introduced in the 2011 BioNLP-ST Bacteria Track as Bacteria Gene Interaction (BI) and Bacteria Biotope (BB). The Bacteria Track was motivated by the need to develop specific BioNLP tools for fine-grained event extraction in bacteria biology. The 2013 tasks expand on the 2011 versions by better addressing the needs of biological knowledge modeling, and new evaluation metrics were designed for the new goals. Moving beyond a list of gene interactions, the goal of the GRN task is to build a gene regulation network from the extracted gene interactions. BB'13, like BB'11, is dedicated to the extraction of bacteria biotopes, i.e. bacterial environmental information. BB'13 extends the typology of BB'11 to a large diversity of biotopes, as defined by the OntoBiotope ontology. The detection of entities and the detection of events are tackled by distinct subtasks in order to measure the progress achieved by the participant systems since 2011.

Results: This paper details the corpus preparation and the evaluation metrics, and summarizes and discusses the participant results. Five groups participated in each of the two tasks. The high diversity of the participant methods reflects the dynamism of the BioNLP research community. The highest scores for the GRN and BB'13 tasks are similar to those obtained by the participants in 2011, despite the increase in difficulty. The high density of events in short text segments (multi-event extraction) was a difficult issue for the participating systems in both tasks. The analysis of the BB'13 results also shows that co-reference resolution and entity boundary detection remain major hindrances.

Conclusion: The evaluation results suggest new research directions for the improvement and development of Information Extraction for molecular and environmental biology. The Bacteria Track tasks remain publicly open; the BioNLP-ST website provides an online evaluation service, the reference corpora and the evaluation tools.
