From Questions to Effective Answers: On the Utility of Knowledge-Driven Querying Systems for Life Sciences Data

We compare two approaches to querying life sciences data. The first stores the data in conventional databases and exposes it through intuitive form-based query interfaces, of the kind commonly used by the life science researchers we study. The second uses a large OWL ontology, with the same datasets represented as RDF instances of that ontology. A team of cell biologists uses both approaches in parallel in their daily research, with the objective of gradually replacing the conventional approach with the knowledge-driven one. We describe several benefits of the knowledge-driven approach over the traditional one and highlight a few limitations. We believe that our analysis not only makes explicit the benefits and limitations of Semantic Web technologies in the life sciences, but also contributes toward effective ways of translating a question in a researcher’s mind into precise queries that yield effective answers.
