Clinical Research Informatics for Big Data and Precision Medicine.

OBJECTIVES: To reflect on the notable events and significant developments in Clinical Research Informatics (CRI) in 2015 and to discuss near-term trends affecting CRI. METHODS: We selected key publications that highlight important recent advances in CRI as well as notable events likely to have a significant impact on CRI activities over the next few years or longer. We also consulted discussions in relevant scientific communities and an online living textbook for modern clinical trials, and we related new concepts to long-standing problems to improve the continuity of CRI research. RESULTS: The highlights in CRI in 2015 include the growing adoption of electronic health records (EHRs) and the rapid development of regional, national, and global clinical data research networks that use EHR data to integrate scalable clinical research with clinical care and to generate robust medical evidence. Persistent challenges include data quality, integration, and fusion; data access by researchers; study transparency; reproducibility of results; and infrastructure sustainability. CONCLUSION: Advances in Big Data analytics and Internet technologies, together with citizen engagement in science, are shaping the global clinical research enterprise, which is becoming more open and increasingly stakeholder-centered; stakeholders include patients, clinicians, researchers, and sponsors.