Identifying health information technology related safety event reports from patient safety event report databases

OBJECTIVE The objective of this paper was to identify health information technology (HIT) related events from patient safety event (PSE) report free-text descriptions. A difference-based scoring approach was used to prioritize and select model features. A feature-constraint model was developed and evaluated to support the analysis of PSE reports.

METHODS 5287 PSE reports manually coded as likely or unlikely related to HIT were used to train unigram, bigram, and combined unigram-bigram logistic regression and support vector machine (SVM) models using five-fold cross validation. A difference-based scoring approach was used to prioritize and select unigram and bigram features by their relative importance to likely and unlikely HIT reports. A held-out set of 2000 manually coded reports was used for testing.

RESULTS Unigram models tended to perform better than bigram and combined models. A 300-unigram logistic regression model had classification performance comparable to that of a 4030-unigram SVM model, with a faster relative run-time. The 300-unigram logistic regression model evaluated with the testing data had an AUC of 0.931 and an F1-score of 0.765.

DISCUSSION A difference-based scoring, prioritization, and feature selection approach can be used to generate simplified models with high performance. A feature-constraint model may be more easily shared across healthcare organizations seeking to analyze their respective datasets, and more easily customized for local variations in PSE reporting practices.

CONCLUSION The feature-constraint model provides a way to identify HIT-related patient safety hazards that is applicable across healthcare systems with variability in their PSE report structures.
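The difference-based scoring step can be illustrated with a minimal sketch. The abstract does not give the scoring formula, so this example assumes one plausible reading: each unigram is scored by the fraction of likely-HIT reports that contain it minus the fraction of unlikely-HIT reports that contain it, and features are ranked by the absolute value of that difference. The function names (`tokenize`, `difference_scores`, `select_features`) and the four toy reports are hypothetical and are not taken from the paper.

```python
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately simple tokenizer)."""
    return text.lower().split()


def difference_scores(reports, labels):
    """Score each unigram by its document-frequency difference between classes.

    Assumed scoring rule (not confirmed by the source):
    score = P(term in report | likely HIT) - P(term in report | unlikely HIT).
    Labels are 1 for likely HIT and 0 for unlikely HIT.
    """
    pos_docs = [set(tokenize(r)) for r, y in zip(reports, labels) if y == 1]
    neg_docs = [set(tokenize(r)) for r, y in zip(reports, labels) if y == 0]
    if not pos_docs or not neg_docs:
        raise ValueError("both classes need at least one report")

    # Count how many reports in each class contain each term.
    pos_counts = Counter(term for doc in pos_docs for term in doc)
    neg_counts = Counter(term for doc in neg_docs for term in doc)

    vocab = set(pos_counts) | set(neg_counts)
    return {
        term: pos_counts[term] / len(pos_docs) - neg_counts[term] / len(neg_docs)
        for term in vocab
    }


def select_features(scores, k):
    """Return the k terms with the largest absolute score (ties broken alphabetically)."""
    ranked = sorted(scores, key=lambda term: (-abs(scores[term]), term))
    return ranked[:k]


# Toy data, invented for illustration only.
reports = [
    "order entered in system for wrong patient",
    "system froze while scanning barcode",
    "patient fell in hallway",
    "patient given wrong dose by nurse",
]
labels = [1, 1, 0, 0]

scores = difference_scores(reports, labels)
top_terms = select_features(scores, 3)
```

In this toy example, "system" appears in every likely-HIT report and in no unlikely-HIT report, so it ranks first, while "wrong" appears equally often in both classes and scores zero. The selected terms could then serve as the constrained vocabulary for a classifier such as logistic regression, which is the role the 300-unigram feature set plays in the study.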
