An up-to-date knowledge-based literature search and exploration framework for focused bioscience domains

In domain-specific search systems, knowledge of a domain of interest is embedded as a backbone that guides the search process. However, the knowledge used in most such systems (1) exists for only a few well-known, broad domains; (2) is basic in nature, being either purely hierarchical or involving only a few relationship types; and (3) is not always kept up to date, so it misses insights from recently published results. In this paper, we present a framework and implementation of a focused and up-to-date knowledge-based search system, called Scooner, that utilizes domain-specific knowledge extracted from recent bioscience abstracts. To our knowledge, this is the first attempt in the field to address all three shortcomings mentioned above. Since its recent introduction for operational use at the Applied Biotechnology Branch of AFRL, some biologists have been using Scooner on a regular basis, and it is being made available to many more. Initial evaluations point to the promise of the approach in meeting the challenge we set out to address.
