Optimising the use of electronic health records to estimate the incidence of rheumatoid arthritis in primary care: what information is hidden in free text?

Background

Primary care databases are a major source of data for epidemiological and health services research. However, most studies are based on coded information and ignore information stored in free text. Using the early presentation of rheumatoid arthritis (RA) as an exemplar, our objective was to estimate the extent of data hidden within free text, using a keyword search.

Methods

We examined the electronic health records (EHRs) of 6,387 patients from the UK, aged 30 years and older, with a first coded diagnosis of RA between 2005 and 2008. We listed indicators for RA which were present in coded format and ran keyword searches for similar information held in free text. The frequencies of indicator code groups and keywords from one year before to 14 days after RA diagnosis were compared, and temporal relationships were examined.

Results

One or more keywords for RA were found in the free text of 29% of patients prior to the RA diagnostic code. Keywords for inflammatory arthritis diagnoses were present for 14% of patients, whereas only 11% had a diagnostic code. Codes for synovitis were found in 3% of patients, but keywords were identified in an additional 17%. In 13% of patients there was evidence of a positive rheumatoid factor test in text only, uncoded. No gender differences were found. Keywords generally occurred close in time to the coded diagnosis of rheumatoid arthritis. They were often found under codes indicating letters and communications.

Conclusions

Potential cases may be missed or wrongly dated when coded data alone are used to identify patients with RA, as diagnostic suspicions are frequently confined to text. The use of EHRs to create disease registers or assess quality of care will be misleading if free text information is not taken into account. Methods to facilitate the automated processing of text need to be developed and implemented.
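The keyword search described in the Methods can be illustrated with a minimal Python sketch. It is not the authors' implementation: the keyword groups, the function names, the record layout (a date paired with a free-text string) and the reading of "one year" as 365 days are all assumptions made for illustration. It scans the free-text entries that fall between one year before and 14 days after the coded diagnosis date and records the dates on which each keyword group appears.

```python
import re
from datetime import date, timedelta

# Hypothetical keyword groups; the study's actual keyword lists are not given here.
KEYWORD_GROUPS = {
    "rheumatoid_arthritis": ["rheumatoid arthritis"],
    "inflammatory_arthritis": ["inflammatory arthritis"],
    "synovitis": ["synovitis"],
    "rheumatoid_factor": ["rheumatoid factor"],
}

# Compile one case-insensitive pattern per group.
PATTERNS = {
    group: re.compile("|".join(re.escape(k) for k in keywords), re.IGNORECASE)
    for group, keywords in KEYWORD_GROUPS.items()
}


def in_window(entry_date, diagnosis_date, days_before=365, days_after=14):
    """True if the entry falls in the window around the coded diagnosis.

    Assumption: 'one year before' is approximated as 365 days.
    """
    start = diagnosis_date - timedelta(days=days_before)
    end = diagnosis_date + timedelta(days=days_after)
    return start <= entry_date <= end


def keyword_hits(entries, diagnosis_date):
    """Return {group: [entry dates]} for free-text entries inside the window.

    `entries` is an iterable of (entry_date, free_text) pairs (assumed layout).
    """
    hits = {group: [] for group in PATTERNS}
    for entry_date, text in entries:
        if not in_window(entry_date, diagnosis_date):
            continue
        for group, pattern in PATTERNS.items():
            if pattern.search(text):
                hits[group].append(entry_date)
    return hits
```

A plain keyword match of this kind does not distinguish a suspected, negated or historical mention from a confirmed one, so its hits would still need review before being counted as evidence of disease.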
