SCALEUS: Semantic Web Services Integration for Biomedical Applications

In recent years, we have witnessed an explosion of biological data, resulting largely from the demands of life science research. The vast majority of these data are freely available through diverse bioinformatics platforms, including relational databases and conventional keyword search applications. This approach has achieved great results in the last few years, but it proves unfeasible when information needs to be combined or shared among different and scattered sources. Many of these data distribution challenges have since been solved by the adoption of semantic web technologies. Despite the evident benefits of these technologies, their adoption has introduced new challenges related to migrating existing systems to the semantic level. To facilitate this transition, we have developed SCALEUS, a semantic web migration tool that can be deployed on top of traditional systems to bring knowledge, inference rules, and query federation to existing data. Targeted at the biomedical domain, this web-based platform offers, in a single package, straightforward data integration and semantic web services that help developers and researchers create new semantically enhanced information systems. SCALEUS is available as open source at http://bioinformatics-ua.github.io/scaleus/.
