Fusing Continuous-Valued Medical Labels Using a Bayesian Model

With the rapid increase in the volume of time-series medical data available through wearable devices, there is a need for automated algorithms to label the data. Examples of labels include interventions, changes in activity (e.g. sleep), and changes in physiology (e.g. arrhythmias). However, automated algorithms tend to be unreliable, resulting in lower-quality care. Expert annotations are scarce, expensive, and prone to significant inter- and intra-observer variance. To address these problems, a Bayesian Continuous-valued Label Aggregator (BCLA) is proposed to provide a reliable estimate of the aggregated label while accurately inferring the precision and bias of each algorithm. The BCLA was applied to QT interval (a pro-arrhythmic indicator) estimation from the electrocardiogram using labels from the 2006 PhysioNet/Computing in Cardiology Challenge database. It was compared with the mean, the median, and a previously proposed Expectation Maximization (EM) label aggregation approach. While accurately predicting each labelling algorithm’s bias and precision, the BCLA achieved a root-mean-square error of 11.78 ± 0.63 ms, significantly outperforming the best Challenge entry (15.37 ± 2.13 ms) as well as the EM, mean, and median voting strategies (14.76 ± 0.52, 17.61 ± 0.55, and 14.43 ± 0.57 ms respectively, with p < 0.0001). The BCLA could therefore provide accurate estimates for continuous-valued medical labelling tasks in an unsupervised manner, even when the ground truth is not available.
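The mean and median voting baselines and the root-mean-square error metric mentioned above can be illustrated with a minimal Python sketch. This is not the BCLA itself, whose model details are not given here. The function names `fuse` and `rmse` and the toy QT values are hypothetical, chosen only to show how one biased annotator affects each baseline.

```python
import math
import statistics


def fuse(labels_per_record, how="median"):
    """Fuse each record's annotations into one value by mean or median voting."""
    aggregate = statistics.median if how == "median" else statistics.mean
    return [aggregate(row) for row in labels_per_record]


def rmse(estimates, reference):
    """Root-mean-square error between fused estimates and reference values."""
    squared = [(e - r) ** 2 for e, r in zip(estimates, reference)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical toy data (not from the Challenge database): QT intervals in ms.
# Each row is one recording; each column is one automated algorithm.
reference = [400.0, 380.0, 420.0, 410.0]
labels = [
    [402.0, 395.0, 460.0],  # third algorithm overestimates badly here
    [381.0, 372.0, 379.0],
    [418.0, 415.0, 421.0],
    [409.0, 404.0, 350.0],  # third algorithm underestimates badly here
]

rmse_mean = rmse(fuse(labels, how="mean"), reference)
rmse_median = rmse(fuse(labels, how="median"), reference)
print(f"mean voting RMSE:   {rmse_mean:.2f} ms")
print(f"median voting RMSE: {rmse_median:.2f} ms")
```

In this toy case the median is less affected by the single outlying algorithm than the mean. Neither baseline estimates a per-algorithm bias or precision, which is the gap the BCLA is proposed to fill.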
