Clinical Research Informatics: Contributions from 2016

Objectives: To summarize key contributions to current research in the field of Clinical Research Informatics (CRI) and to select the best papers published in 2016.

Methods: A bibliographic search combining MeSH and free-text terms on CRI was performed in PubMed, followed by a double-blind review to select a list of candidate best papers, which were then peer-reviewed by external reviewers. A consensus meeting between the two section editors and the editorial team was held to finalize the selection of best papers.

Results: Among the 452 papers published in 2016 in the various areas of CRI and returned by the query, the full review process selected four best papers. The authors of the first paper used a comprehensive representation of the patient medical record and semi-automatically labeled training sets to create phenotype models via a machine learning process. The second selected paper describes an open source tool chain that securely connects ResearchKit-compatible applications (Apps) to the widely used clinical research infrastructure Informatics for Integrating Biology and the Bedside (i2b2). The third selected paper describes the FAIR Guiding Principles for scientific data management and stewardship. The fourth selected paper focuses on evaluating the risk of privacy breaches when releasing genomics datasets.

Conclusions: A major trend in the 2016 publications is the variety of research on “real-world data” (healthcare-generated data, person health data, and patient-reported outcomes), highlighting the opportunities provided by new machine learning techniques as well as new potential risks of privacy breaches.
