Big biomedical data and cardiovascular disease research: opportunities and challenges.

Electronic health records (EHRs), data generated and collected during normal clinical care, are increasingly being linked and used for translational cardiovascular disease research. EHR data can be structured (e.g. coded diagnoses) or unstructured (e.g. clinical notes), and increasingly encompass medical imaging, genomic, and patient-generated information. Large-scale EHR linkages enable researchers to conduct high-resolution observational and interventional clinical research at an unprecedented scale. However, substantial preparatory work is required to identify, obtain, and transform raw EHR data into research-ready variables that can be statistically analysed. This study critically reviews the opportunities and challenges that EHR data present for cardiovascular disease clinical research and provides a series of recommendations for advancing and facilitating EHR research.
