Invited commentary: Observational research in the age of the electronic health record.

Historically, clinical epidemiologic research has been constrained by the costs and time associated with manually identifying cases and abstracting clinical data. In this issue, Carrell et al. (Am J Epidemiol. 2014;179(6);749-758) report on their impressive success using natural language processing techniques to correctly identify cases of cancer recurrence among women with previous breast cancer. They report a 10-fold decrease in the need for chart abstraction, though with an 8% loss in case detection. This commentary outlines some recent history associated with the development of "high-throughput clinical phenotyping" of electronic health records and speculates on the impact such computational capabilities may have for observational research and patient consent.

[1]  D. Blumenthal,et al.  The "meaningful use" regulation for electronic health records. , 2010, The New England journal of medicine.

[2]  Olson Jacob Duane About the Director , 2015 .

[3]  Christopher G. Chute,et al.  The absence of longitudinal data limits the accuracy of high-throughput clinical phenotyping for identifying type 2 diabetes mellitus subjects , 2013, Int. J. Medical Informatics.

[4]  Christopher G Chute,et al.  Discovering peripheral arterial disease cases from radiology notes using natural language processing. , 2010, AMIA ... Annual Symposium proceedings. AMIA Symposium.

[5]  Wendy A. Wolf,et al.  The eMERGE Network: A consortium of biorepositories linked to electronic medical records data for conducting genomic studies , 2011, BMC Medical Genomics.

[6]  Christopher G Chute,et al.  Analyzing the heterogeneity and complexity of Electronic Health Record oriented phenotyping algorithms. , 2011, AMIA ... Annual Symposium proceedings. AMIA Symposium.

[7]  Scott R. Halgrim,et al.  Using natural language processing to improve efficiency of manual chart abstraction in research: the case of breast cancer recurrence. , 2014, American journal of epidemiology.

[8]  Jin Fan,et al.  Leveraging informatics for genetic studies: use of the electronic medical record to enable a genome-wide association study of peripheral arterial disease , 2010, J. Am. Medical Informatics Assoc..

[9]  Christopher G Chute,et al.  The SHARPn project on secondary use of Electronic Medical Record data: progress, plans, and possibilities. , 2011, AMIA ... Annual Symposium proceedings. AMIA Symposium.

[10]  Cui Tao,et al.  Building a robust, scalable and standards-driven infrastructure for secondary use of EHR data: The SHARPn project , 2012, J. Biomed. Informatics.

[11]  Sunghwan Sohn,et al.  Mayo clinical Text Analysis and Knowledge Extraction System (cTAKES): architecture, component evaluation and applications , 2010, J. Am. Medical Informatics Assoc..

[12]  Melissa A. Basford,et al.  The Electronic Medical Records and Genomics (eMERGE) Network: past, present, and future , 2013, Genetics in Medicine.

[13]  Pedro J. Caraballo,et al.  Impact of data fragmentation across healthcare centers on the accuracy of a high-throughput clinical phenotyping algorithm for specifying subjects with type 2 diabetes mellitus , 2012, J. Am. Medical Informatics Assoc..

[14]  Christopher G. Chute,et al.  Some experiences and opportunities for big data in translational research , 2013, Genetics in Medicine.

[15]  Suzette J. Bielinski,et al.  Use of diverse electronic medical record systems to identify genetic risk for type 2 diabetes within a genome-wide association study , 2012, J. Am. Medical Informatics Assoc..

[16]  C. Chute,et al.  Electronic Medical Records for Genetic Research: Results of the eMERGE Consortium , 2011, Science Translational Medicine.

[17]  Melissa A. Basford,et al.  Validation of electronic medical record-based phenotyping algorithms: results and lessons learned from the eMERGE network. , 2013, Journal of the American Medical Informatics Association : JAMIA.