A Survey of Semantic Integration Approaches in Bioinformatics

Advances in computer science and data analysis continuously generate huge volumes of biological data, which are available on the web. Exploiting these data requires powerful integration techniques that extract the knowledge and information pertinent to a specific question. Biomedical exploration of such big data often requires complex queries across multiple autonomous, heterogeneous, and distributed data sources. Semantic integration is an active area of research in several disciplines, including databases, information integration, and ontologies. We survey approaches and techniques for integrating biological data, focusing on those developed in the ontology community.

Keywords: semantic data integration, biological ontology, linked data, semantic web, OWL, RDF.
