Data Integration in Genomic Medicine: Trends and Applications

OBJECTIVES In the near future, each person's sequenced genome will be part of his or her electronic health record. At that point, genomic medicine will become fundamental to clinical practice as a key component of personalized medicine. The genomic data, as well as the other 'omics' and clinical data that personalized medicine requires, are stored in several distributed databases. Research and patient care increasingly require the integration of biomedical data from multiple distributed, heterogeneous data sources.

METHODS This work presents a comprehensive review of the most relevant work on biomedical data integration, specifically of genomic medical data, analyzing how architectures and integration techniques have evolved over the last 20 years and how they have been used.

CONCLUSION Most of these solutions, based on cross-linking, data warehouse, or federated approaches, are suitable for specific domains. However, none of the models found in the literature is completely appropriate for the general biomedical data integration problem.
