Demonstration of the potential of white-box machine learning approaches to gain insights from cardiovascular disease electrocardiograms

We present the results of a white-box machine learning approach to detecting cardiac arrhythmias from electrocardiographic data. A C5.0 decision tree is trained to recognize four classes using common features. The four classes are (i) atrial fibrillation and atrial flutter, (ii) tachycardias, (iii) sinus bradycardia and (iv) sinus rhythm. Data from 10,646 subjects are used, 83% of whom have at least one arrhythmia and 17% of whom exhibit a normal sinus rhythm. The tree is trained with 10-fold cross-validation and achieves a balanced accuracy of 95.35%. The white-box approach yields a clear and comprehensible tree structure that uses only the 5 most important of the 24 available features. These 5 features are ventricular rate, RR-interval variation, atrial rate, age, and the difference between the longest and shortest RR interval. The tree shows that the combination of ventricular rate, RR-interval variation and atrial rate is especially relevant to classification accuracy, and it assigns distinct split values to these features to distinguish the classes. These findings could inform future medical applications. The results show that a white-box machine learning approach can reveal fine-grained structure, confirming known linear relationships and also revealing nonlinear ones. To highlight the strength of C5.0 in revealing this structure, the results of further white-box and black-box machine learning algorithms are also presented.
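Because the classes are imbalanced, the reported metric is balanced accuracy rather than plain accuracy. The following is a minimal sketch of that metric, assuming the common definition as the mean of per-class recall; the function name, the class labels and the toy data are hypothetical and are not taken from the study.

```python
from collections import defaultdict


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall over the classes present in y_true."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("y_true must not be empty")
    total = defaultdict(int)    # number of samples per true class
    correct = defaultdict(int)  # correctly predicted samples per true class
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        if t == p:
            correct[t] += 1
    recalls = [correct[c] / total[c] for c in total]
    return sum(recalls) / len(recalls)


# Hypothetical toy labels for the four classes (not study data).
y_true = ["AFIB"] * 4 + ["TACHY"] * 3 + ["SB"] * 2 + ["SR"] * 1
# A degenerate classifier that always predicts the largest class.
y_pred = ["AFIB"] * 10

plain_accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
score = balanced_accuracy(y_true, y_pred)
print(plain_accuracy, score)
```

In this toy case the always-majority classifier gets 4 of 10 samples right, yet its balanced accuracy is only 0.25, because three of the four classes have zero recall. This is why balanced accuracy is the more informative figure when one group dominates the data.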
