Strategies to Access Patient Clinical Data from Distributed Databases

Over the last twenty years, electronic health record systems have come into widespread use worldwide, producing an extensive collection of health databases. These databases can make health research studies faster and cheaper, and such studies are essential to the advancement of health science and the improvement of health services. However, despite the recognised benefits of data sharing, database owners remain reluctant to grant access to the contents of their databases, both because of privacy and security concerns and because there is no common strategy for data sharing. Two main approaches have been used to perform distributed queries while keeping full control of the data in the hands of the data custodians: applying a common data model, or using Semantic Web principles. This paper compares the two approaches, evaluating them against parameters relevant to data integration: cost, data quality, interoperability, extendibility, consistency, and efficiency.
