Icentia11K: An Unsupervised Representation Learning Dataset for Arrhythmia Subtype Discovery

We release the largest public ECG dataset of continuous raw signals for representation learning, containing 11 thousand patients and 2 billion labelled beats. Our goal is to enable the development of semi-supervised ECG models and the discovery of unknown arrhythmia subtypes and anomalous ECG signal events. To this end, we propose an unsupervised representation learning task, evaluated in a semi-supervised fashion. We provide a set of baselines for different feature extractors that others can build upon. Additionally, we perform qualitative evaluations of PCA embeddings, in which we identify some clustering of known subtypes, indicating the potential of representation learning for arrhythmia subtype discovery.
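As a rough illustration of the qualitative PCA evaluation mentioned above, the following is a minimal sketch of projecting feature vectors onto their top two principal components. It is not the authors' pipeline: the function name `pca_embed`, the 16-dimensional feature size, and the two synthetic groups standing in for learned beat embeddings are all assumptions made for illustration.

```python
import numpy as np


def pca_embed(features, n_components=2):
    """Project row-vector features onto their top principal components.

    Hypothetical helper for illustration; not part of the dataset's tooling.
    """
    # Center each feature dimension so PCA captures variance, not the mean.
    centered = features - features.mean(axis=0)
    # SVD of the centered data: the rows of vt are the principal directions,
    # ordered by decreasing singular value.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T


rng = np.random.default_rng(0)

# Synthetic stand-in for learned embeddings of two beat subtypes (assumption).
group_a = rng.normal(loc=0.0, scale=1.0, size=(100, 16))
group_b = rng.normal(loc=5.0, scale=1.0, size=(100, 16))
features = np.vstack([group_a, group_b])
labels = np.array([0] * 100 + [1] * 100)

embedded = pca_embed(features)

# Mean position of each group along the first principal component.
mean_a = embedded[labels == 0, 0].mean()
mean_b = embedded[labels == 1, 0].mean()
print("PC1 group means:", mean_a, mean_b)
```

In a setting like the one described, `features` would instead come from a trained feature extractor, and the 2-D `embedded` points would be plotted and coloured by known beat or rhythm labels to inspect whether subtypes cluster.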
