Improving Search through Event-based Biomedical Text Mining

There has been a major focus on event extraction for biomedical applications. In this paper, we focus on search: we highlight some of the drawbacks of popular search methods and show how event extraction and associated technologies, e.g., named entity recognition, can help to improve the efficiency of search. We also explore how event extraction can be enhanced through a new type of annotation, i.e., meta-knowledge annotation, which can facilitate the extraction of high-level information relating to the intended interpretation of events, e.g., whether they represent a hypothesis, a claim, a belief, an opinion, a well-established fact, or a tentative or more confident analysis of experimental results.
