Prevalence of heart failure signs and symptoms in a large primary care population identified through the use of text and data mining of the electronic health record.

BACKGROUND: The electronic health record (EHR) contains a tremendous amount of data that, if appropriately detected, can lead to earlier identification of disease states such as heart failure (HF). Using a novel text and data analytic tool, we explored the longitudinal EHRs of over 50,000 primary care patients to identify documentation of the signs and symptoms of HF in the years preceding its diagnosis.

METHODS AND RESULTS: The retrospective analysis consisted of 4,644 incident HF cases and 45,981 group-matched control subjects. Documentation of Framingham HF signs and symptoms within encounter notes was identified with the use of a previously validated natural language processing procedure. A total of 892,805 affirmed criteria were documented over an average observation period of 3.4 years. Among eventual HF cases, 85% had ≥1 criterion within 1 year before their HF diagnosis, as did 55% of control subjects. Substantial variability in the prevalence of individual signs and symptoms was found in both case and control subjects.

CONCLUSIONS: HF signs and symptoms are frequently documented in a primary care population, as identified through automated text and data mining of EHRs. Their frequent identification demonstrates the rich data available within EHRs that will allow for future work on automated criterion identification to help develop predictive models for HF.
