Semantic Data Integration on Biomedical Data Using Semantic Web Technologies

Contemporary life sciences research requires an understanding of systems across wide ranges of scale and distribution. There is therefore an urgent need to integrate biomedical knowledge generated by different communities and separate subfields (Shadbolt et al., 2006). Scientific publications and curated databases together hold a vast amount of this usable knowledge, and the number, size, and complexity of life science databases continue to grow (Kei-Hoi et al., 2009). Scientists in genomics, proteomics, metabolomics, clinical medicine, and drug discovery therefore need a concept for integrating their data (Shadbolt et al., 2006), a task that remains a prominent problem (Kei-Hoi et al., 2009). Several challenges must still be overcome before such a uniform data integration concept can be realized: handling the variety and volume of available data, resolving inconsistencies that arise from heterogeneous sources, accommodating the autonomy and differing capabilities of the sources, and addressing the lack of standards for such an integration concept. Many heterogeneity conflicts remain in data integration due to the lack of semantics (Gagnon, 2007). To exploit the knowledge from different resources efficiently, it will be important to connect the sources in a manner that lets machine processes traverse and intelligently identify these links (Neumann et al., 2004). Semantic Web technologies are a promising approach to integrating heterogeneous data sources: they provide a framework for dealing with the aforementioned problems and fulfil the requirements for machine processing. This book chapter provides an overview of data integration on biomedical data using Semantic Web technologies, including existing techniques (standards, specifications, and methods), challenges, approaches, and projects.

[1]  Ian Horrocks,et al.  FaCT++ Description Logic Reasoner: System Description , 2006, IJCAR.

[2]  Heiner Stuckenschmidt,et al.  Ontology-Based Integration of Information - A Survey of Existing Approaches , 2001, OIS@IJCAI.

[3]  Kei-Hoi Cheung,et al.  LinkHub: a Semantic Web system that facilitates cross-database queries and information retrieval in proteomics , 2007, BMC Bioinformatics.

[4]  Andrea Calì,et al.  Accessing Data Integration Systems through Conceptual Schemas , 2001, ER.

[5]  York Sure-Vetter,et al.  Ontology Mapping - An Integrated Approach , 2004, ESWS.

[6]  Carole A. Goble,et al.  State of the nation in data integration for bioinformatics , 2008, J. Biomed. Informatics.

[7]  Rachael P. Huntley,et al.  The Gene Ontology Annotation (GOA) Database , 2009 .

[8]  Mark Gerstein,et al.  Semantic Web Approach to Database Integration in the Life Sciences , 2007 .

[9]  Thomas R. Gruber,et al.  Toward principles for the design of ontologies used for knowledge sharing? , 1995, Int. J. Hum. Comput. Stud..

[10]  A. Valencia,et al.  Linking genes to literature: text mining, information extraction, and retrieval applications for biology , 2008, Genome Biology.

[11]  Jorge Martinez-Gil.  Thinking on the Web: Berners-Lee, Gödel and Turing , 2007 .

[12]  Olivier Bodenreider,et al.  Integrating the UMLS into an RDF-Based Biomedical Knowledge Repository. , 2007, AMIA ... Annual Symposium proceedings. AMIA Symposium.

[13]  Dieter Fensel,et al.  Knowledge Engineering: Principles and Methods , 1998, Data Knowl. Eng..

[14]  Sean Bechhofer,et al.  Understanding and using the meaning of statements in a bio-ontology: recasting the Gene Ontology in OWL , 2007, BMC Bioinformatics.

[15]  Wendy Hall,et al.  The Semantic Web Revisited , 2006, IEEE Intelligent Systems.

[16]  Asunción Gómez-Pérez,et al.  Six challenges for the Semantic Web , 2002, KR 2002.

[17]  Jeffrey T. Pollock.  Semantic Web For Dummies , 2009 .

[18]  E. Birney,et al.  The International Protein Index: An integrated database for proteomics experiments , 2004, Proteomics.

[19]  Olivier Bodenreider,et al.  Alignment of the UMLS semantic network with BioTop: methodology and assessment , 2009, Bioinform..

[20]  Michel Gagnon,et al.  Ontology-based integration of data sources , 2007, 2007 10th International Conference on Information Fusion.

[21]  Peter Haase,et al.  An evaluation of approaches to federated query processing over linked data , 2010, I-SEMANTICS '10.

[22]  Katy Börner,et al.  Semantic Association Networks: Using Semantic Web Technology to Improve Scholarly Knowledge and Expertise Management , 2006, Visualizing the Semantic Web, 2nd Edition.

[23]  Dan Brickley,et al.  FOAF Vocabulary Specification , 2004 .

[24]  Hans-Michael Müller,et al.  Textpresso for Neuroscience: Searching the Full Text of Thousands of Neuroscience Research Papers , 2008, Neuroinformatics.

[25]  Roy T. Fielding,et al.  Uniform Resource Identifier (URI): Generic Syntax , 2005, RFC.

[26]  Kei-Hoi Cheung,et al.  Semantic Web: Revolutionizing Knowledge Discovery in the Life Sciences , 2006 .

[27]  Daniel Oberle,et al.  Implementing views for light-weight Web ontologies , 2003, Proceedings of the Seventh International Database Engineering and Applications Symposium, 2003.

[28]  Anand Kumar,et al.  Text mining and ontologies in biomedicine: Making sense of raw text , 2005, Briefings Bioinform..

[29]  Hyoil Han,et al.  A survey on ontology mapping , 2006, SGMD.

[30]  Elena Beisswanger,et al.  BioTop: An upper domain ontology for the life sciences: A description of its current structure, contents and interfaces to OBO ontologies , 2008, Appl. Ontology.

[31]  Vipul Kashyap,et al.  Representing the UMLS Semantic Network Using OWL: (Or "What's in a Semantic Web Link?") , 2003, SEMWEB.

[32]  L. Grivell,et al.  Text mining for biology - the way forward: opinions from leading scientists , 2008, Genome Biology.

[33]  Peter Buneman,et al.  Challenges in Integrating Biological Data Sources , 1995, J. Comput. Biol..

[34]  Andrea Calì,et al.  On the Expressive Power of Data Integration Systems , 2002, ER.

[35]  A.-C. Boury-Brisset,et al.  Ontology-based approach for information fusion , 2003, Proceedings of the Sixth International Conference of Information Fusion, 2003.

[36]  Dimitra Alexopoulou,et al.  Terminologies for text-mining; an experiment in the lipoprotein metabolism domain , 2008, BMC Bioinformatics.

[37]  Olivier Bodenreider,et al.  Ontologies and Data Integration in Biomedicine: Success Stories and Challenging Issues , 2008, DILS.

[38]  Amit P. Sheth,et al.  Semantic interoperability in global information systems , 1999, SGMD.

[39]  Paola Velardi,et al.  Evaluation of OntoLearn, a Methodology for Automatic Learning of Domain Ontologies , 2005 .

[40]  Eric K. Neumann,et al.  What the semantic web could do for the life sciences , 2004 .

[41]  Yarden Katz,et al.  Pellet: A practical OWL-DL reasoner , 2007, J. Web Semant..

[42]  Sophia Ananiadou,et al.  Text mining and its potential applications in systems biology. , 2006, Trends in biotechnology.

[43]  Rachael P. Huntley,et al.  The GOA database in 2009—an integrated Gene Ontology Annotation resource , 2008, Nucleic Acids Res..

[44]  Maurizio Vincini,et al.  Synthesizing an Integrated Ontology , 2003, IEEE Internet Comput..

[45]  Carlos Alberto Heuser,et al.  Integrating Biological Databases , 2003, SBBD.

[46]  Julie Chabalier,et al.  Integrating and querying disease and pathway ontologies : building an OWL model and using RDFS queries , 2007 .

[47]  Olivier Bodenreider,et al.  The Unified Medical Language System (UMLS): integrating biomedical terminology , 2004, Nucleic Acids Res..

[48]  Miguel García-Remesal,et al.  ONTOFUSION: Ontology-based integration of genomic and clinical databases , 2006, Comput. Biol. Medicine.

[49]  Bénédicte Le Grand,et al.  Visualisation of the Semantic Web: Topic Maps visualisation , 2002, Proceedings Sixth International Conference on Information Visualisation.

[50]  Paola Velardi,et al.  Learning Domain Ontologies from Document Warehouses and Dedicated Web Sites , 2004, CL.

[51]  M. Ashburner,et al.  The OBO Foundry: coordinated evolution of ontologies to support biomedical data integration , 2007, Nature Biotechnology.

[52]  A. Rector,et al.  Relations in biomedical ontologies , 2005, Genome Biology.