A Framework for Mining Life Sciences Data on the Semantic Web in an Interactive, Graph-Based Environment

The last decade saw a marked increase in the availability of Life Sciences data on the Semantic Web. At the same time, the need to interactively explore complex and extensive biological datasets led to the development of advanced visualisation tools, many of which present the data in the form of a network graph. Semantic Web technologies offer both a means to define the rich semantics necessary to describe complex biological systems and a way to share large amounts of data effectively. However, at present the need to be familiar with the relevant technologies greatly impedes access to these datasets by non-specialist Life Sciences researchers. To address this, we have developed a software framework that both facilitates access to these resources and presents the returned data in an intuitive, graph-based format. Our framework is closely integrated with Ondex, an established data integration solution in the Life Sciences domain. The implementation consists of two parts. The first is a query console that allows expert users to execute Semantic Web queries directly. The second is a graph-based interactive browsing solution that can be used to launch stock queries by choosing items from a menu. In both cases, the result is reformatted and visualised as a graph in the Ondex front end.
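To make the reformatting step concrete, the following is a minimal sketch of how a query result might be turned into graph nodes and edges. It is not the Ondex implementation. It assumes the result arrives in the standard SPARQL JSON results layout with variables named `s`, `p` and `o`; the function name `bindings_to_graph`, the `SAMPLE` data and the `example.org` URIs are hypothetical and exist only for illustration. Treating literal values as nodes is also a simplifying assumption.

```python
def bindings_to_graph(results, subj="s", pred="p", obj="o"):
    """Convert SPARQL JSON results into (nodes, edges).

    nodes maps each subject/object value to its term type ("uri", "literal", ...).
    edges is a list of (subject, predicate, object) tuples.
    Rows that do not bind all three variables are skipped.
    """
    nodes = {}
    edges = []
    for row in results["results"]["bindings"]:
        if not all(var in row for var in (subj, pred, obj)):
            continue  # incomplete row, e.g. from an OPTIONAL clause
        s, p, o = row[subj], row[pred], row[obj]
        for term in (s, o):
            # Keep the first type seen for each distinct value.
            nodes.setdefault(term["value"], term["type"])
        edges.append((s["value"], p["value"], o["value"]))
    return nodes, edges


# Hypothetical result set, shaped like a SPARQL JSON response.
SAMPLE = {
    "head": {"vars": ["s", "p", "o"]},
    "results": {
        "bindings": [
            {
                "s": {"type": "uri", "value": "http://example.org/protein/P1"},
                "p": {"type": "uri", "value": "http://example.org/encodedBy"},
                "o": {"type": "uri", "value": "http://example.org/gene/G1"},
            },
            {
                "s": {"type": "uri", "value": "http://example.org/protein/P1"},
                "p": {"type": "uri", "value": "http://example.org/label"},
                "o": {"type": "literal", "value": "Example protein"},
            },
            {
                # Missing "o": this row is skipped.
                "s": {"type": "uri", "value": "http://example.org/protein/P2"},
                "p": {"type": "uri", "value": "http://example.org/label"},
            },
        ]
    },
}

if __name__ == "__main__":
    nodes, edges = bindings_to_graph(SAMPLE)
    for source, label, target in edges:
        print(f"{source} --[{label}]--> {target}")
```

A visualisation front end could then draw one vertex per entry in `nodes` and one labelled edge per tuple in `edges`; how Ondex itself maps results onto its own graph model is not described in this abstract.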
