Identifying the Mislabeled Training Samples of ECG Signals using Machine Learning

The classification accuracy of electrocardiogram (ECG) signals is affected by many factors, among which mislabeled training samples are one of the most influential. To mitigate this negative effect, a cross-validation method is introduced to identify the mislabeled samples. The method exploits the complementary strengths of different classifiers, which together act as a filter for the training samples. With the help of 10-fold cross validation, the filter removes the mislabeled training samples and retains the correctly labeled ones. The filtered training set is then provided to the final classifiers to achieve higher classification accuracy. Finally, we numerically demonstrate the effectiveness of the proposed method on the MIT-BIH arrhythmia database.
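The filtering idea can be illustrated with a minimal sketch. It is not the paper's implementation: the three toy classifiers (nearest centroid, 1-NN, 3-NN), the majority-vote rule, the function names, and the synthetic two-cluster data are all assumptions chosen for illustration, since the abstract specifies neither the classifiers nor the voting rule. Each sample is held out once in a 10-fold split, and it is flagged as suspect when most classifiers trained on the other folds disagree with its given label.

```python
import random
from collections import Counter

def sq_dist(a, b):
    return sum((u - v) ** 2 for u, v in zip(a, b))

def nearest_centroid(train_X, train_y, x):
    """Predict the label whose class mean is closest to x."""
    groups = {}
    for xi, yi in zip(train_X, train_y):
        groups.setdefault(yi, []).append(xi)
    centroids = {lab: [sum(col) / len(pts) for col in zip(*pts)]
                 for lab, pts in groups.items()}
    return min(centroids, key=lambda lab: sq_dist(centroids[lab], x))

def knn(k):
    """Return a k-nearest-neighbour predictor using majority vote."""
    def predict(train_X, train_y, x):
        order = sorted(range(len(train_X)),
                       key=lambda i: sq_dist(train_X[i], x))
        votes = Counter(train_y[i] for i in order[:k])
        return votes.most_common(1)[0][0]
    return predict

def cv_filter(X, y, classifiers, n_folds=10, seed=0):
    """Return indices whose label is rejected by a majority of classifiers.

    Every sample is held out exactly once; the classifiers only ever see
    the remaining folds when judging it.
    """
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[f::n_folds] for f in range(n_folds)]
    suspects = []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in idx if i not in held]
        train_X = [X[i] for i in train]
        train_y = [y[i] for i in train]
        for i in held_out:
            wrong = sum(clf(train_X, train_y, X[i]) != y[i]
                        for clf in classifiers)
            if wrong > len(classifiers) / 2:  # assumed majority-vote rule
                suspects.append(i)
    return sorted(suspects)

# Synthetic stand-in for beat features: two well-separated 10x10 grids.
X = [(0.1 * i, 0.1 * j) for i in range(10) for j in range(10)]
X += [(5 + 0.1 * i, 5 + 0.1 * j) for i in range(10) for j in range(10)]
y = [0] * 100 + [1] * 100

# Inject label noise at known positions so the filter can be checked.
flipped = [12, 87, 134, 168]
for i in flipped:
    y[i] = 1 - y[i]

suspects = cv_filter(X, y, [nearest_centroid, knn(1), knn(3)])
keep = [i for i in range(len(X)) if i not in set(suspects)]
clean_X = [X[i] for i in keep]  # filtered training set for a final classifier
clean_y = [y[i] for i in keep]
print("flagged as mislabeled:", suspects)
```

On this toy data the flagged indices coincide with the injected flips. Real ECG features overlap far more than these clusters do, so a filter of this kind would also discard some correctly labeled samples and miss some mislabeled ones.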
