Automation bias in medicine: The influence of automated diagnoses on interpreter accuracy and uncertainty when reading electrocardiograms.

INTRODUCTION Interpretation of the 12-lead electrocardiogram (ECG) is normally assisted by an automated diagnosis (AD), which can induce an 'automation bias' in which interpreters anchor on the AD. In this paper, we studied 1) the effect of an incorrect AD on interpretation accuracy and interpreter confidence (a proxy for uncertainty), and 2) whether confidence and other interpreter features can predict interpretation accuracy using machine learning.

METHODS This study analysed 9000 ECG interpretations from cardiology and non-cardiology fellows (CFs and non-CFs). One third of the ECGs were presented with no AD, one third with an AD (half of which were incorrect) and one third with multiple ADs. Each interpretation was scored, and the interpreter's confidence in it was recorded and subsequently standardised using sigma scaling. Spearman coefficients were used for correlation analysis, and C5.0 decision trees were used to predict interpretation accuracy from basic interpreter features such as confidence, age, experience and designation.

RESULTS Interpretation accuracies achieved by CFs and non-CFs dropped by 43.20% and 58.95% respectively when an incorrect AD was presented (p < 0.001). Overall correlation between scaled confidence and interpretation accuracy was higher amongst CFs. However, correlation between confidence and interpretation accuracy decreased for both groups when an incorrect AD was presented, meaning that an incorrect AD disrupts the reliability of interpreter confidence as a predictor of accuracy. An incorrect AD has a greater effect on the confidence of non-CFs (this difference is not statistically significant, although it is close to the threshold, p = 0.065). The best C5.0 decision tree achieved an accuracy of 64.67% (p < 0.001); however, this is only 6.56% greater than the no-information rate.

CONCLUSION Incorrect ADs reduce the interpreter's diagnostic accuracy, indicating an automation bias. Non-CFs tend to agree with the ADs more than CFs do; hence, less expert physicians are more affected by automation bias. Incorrect ADs reduce the interpreter's confidence and also reduce the power of confidence to predict accuracy (even more so for non-CFs). Whilst a statistically significant model was developed, it is difficult to predict interpretation accuracy using machine learning on basic features such as interpreter confidence, age, reader experience and designation.
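
The statistical steps named in the methods (sigma scaling, Spearman correlation and the no-information rate used as a baseline for the decision tree) can be illustrated with a minimal Python sketch. This is not the authors' code: it assumes that sigma scaling means z-score standardisation, and all function names and the small data vectors are hypothetical and chosen only for illustration.

```python
from statistics import mean, pstdev


def sigma_scale(values):
    """Standardise values to zero mean and unit standard deviation.

    Assumption: 'sigma scaling' is taken here to mean z-scoring.
    """
    mu = mean(values)
    sd = pstdev(values)
    if sd == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sd for v in values]


def average_ranks(values):
    """Rank values from 1 to n; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def spearman(x, y):
    """Spearman coefficient: the Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def no_information_rate(labels):
    """Accuracy obtained by always predicting the most frequent class."""
    counts = {}
    for label in labels:
        counts[label] = counts.get(label, 0) + 1
    return max(counts.values()) / len(labels)


# Hypothetical data: raw confidence ratings and interpretation scores.
confidence = [3, 5, 7, 9, 6]
scores = [1, 2, 4, 5, 3]
correct = [1, 1, 1, 0, 0]  # hypothetical correct/incorrect labels

scaled_confidence = sigma_scale(confidence)
rho = spearman(scaled_confidence, scores)
nir = no_information_rate(correct)
```

Because Spearman's coefficient depends only on ranks, scaling the confidence values does not change it within a single group; standardisation matters when ratings from different interpreters are pooled. If the reported 6.56% is read as percentage points (an assumption), the implied no-information rate is 64.67 - 6.56 = 58.11%.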
