Using anchors from free text in electronic health records to diagnose postoperative delirium

OBJECTIVES Postoperative delirium is a common complication after major surgery among the elderly. Despite its potentially serious consequences, the complication often goes undetected and undiagnosed. To support diagnosis, one could exploit the information hidden in free-text documents from electronic health records using data-driven clinical decision support tools. However, these tools depend on labeled training data, which can be both time-consuming and expensive to create.

METHODS The recent learning with anchors framework addresses this problem by transforming key observations (anchors) into labels. The framework is promising, but its performance relies heavily on clinicians' knowledge to specify good anchors. In this paper we propose a novel method for specifying anchors from free-text documents, following an exploratory data analysis approach based on clustering and data visualization techniques. We investigate the use of the new framework as a way to detect postoperative delirium.

RESULTS By applying the proposed method to medical data gathered from a Norwegian university hospital, we increase the area under the precision-recall curve from 0.51 to 0.96 compared to baselines.

CONCLUSIONS The proposed approach can be used as a framework for clinical decision support for postoperative delirium.
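To make the idea of turning anchors into labels concrete, the following is a minimal sketch and not the implementation used in the paper. It assumes a single hypothetical anchor term ("delirium"), a handful of invented toy notes, a binary bag-of-words representation, and a small hand-written logistic regression. Notes containing the anchor are treated as positives, the anchor is removed from the features so the model must rely on the surrounding wording, and the trained model then scores notes that lack the anchor.

```python
import math

# Hypothetical anchor term and toy notes (illustrative only, not study data).
ANCHORS = {"delirium"}
notes = [
    ["delirium", "confused", "agitated", "night"],
    ["delirium", "confused", "restless"],
    ["confused", "restless"],  # no anchor, but similar wording
    ["mobilized", "eating", "stable"],
    ["stable", "pain", "controlled"],
    ["eating", "mobilized", "discharged"],
]

# Step 1: the presence of an anchor becomes a (noisy) positive label.
labels = [1.0 if ANCHORS & set(note) else 0.0 for note in notes]

# Step 2: binary bag-of-words features with the anchor terms censored,
# so the classifier cannot simply memorize the anchor itself.
vocab = sorted({word for note in notes for word in note} - ANCHORS)
X = [[1.0 if word in note else 0.0 for word in vocab] for note in notes]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(w, b, x):
    return sigmoid(b + sum(wj * xj for wj, xj in zip(w, x)))


def train(X, y, epochs=500, lr=0.5, l2=0.01):
    """L2-regularized logistic regression fitted by batch gradient descent."""
    w, b, n = [0.0] * len(X[0]), 0.0, len(X)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(w), 0.0
        for xi, yi in zip(X, y):
            err = predict(w, b, xi) - yi
            grad_b += err
            for j, xij in enumerate(xi):
                grad_w[j] += err * xij
        w = [wj - lr * (gj / n + l2 * wj) for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


# Step 3: train on the anchor-derived labels and score every note.
w, b = train(X, labels)
scores = [predict(w, b, x) for x in X]
for note, s in zip(notes, scores):
    print(f"{s:.2f}  {' '.join(note)}")
```

In this toy setting the third note carries no anchor, yet it shares its remaining vocabulary with an anchored note, so it is scored above the clearly unrelated notes. The quality of such a classifier depends on how well the anchors are chosen, which is the problem the proposed clustering and visualization approach targets.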
