Big Data Analysis and Classification of Biomedical Signal Using Random Forest Algorithm

The healthcare industry generates huge amounts of data through computer-aided diagnosis systems. In this paper, the authors analyze such data to classify ECG signals using the random forest algorithm. A four-step classification model is designed in this work. First, the raw ECG signals are filtered using a median filter and a 12th-order low-pass filter. Then, features are extracted from the clean ECG signals using the wavelet transform. These wavelet features are then classified using the random forest algorithm. The proposed random forest classification model achieves a classification accuracy of around 87%. This method could be applied in the medical sector to support early detection and better diagnosis of disease.
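
The filtering, wavelet feature extraction, and random forest stages described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not specify the sampling rate, filter family, cutoff frequency, median kernel size, wavelet, decomposition depth, or forest size, so the 360 Hz rate, Butterworth design, 40 Hz cutoff, kernel size of 5, Haar wavelet, 4 levels, and 100 trees are all placeholders. The input is synthetic pulses rather than real ECG recordings, so the resulting accuracy says nothing about the reported 87%.

```python
import numpy as np
from scipy.signal import butter, medfilt, sosfilt
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

FS = 360  # Hz; assumed sampling rate, not stated in the abstract


def denoise(signal, fs=FS, cutoff_hz=40.0):
    """Median filter followed by a 12th-order low-pass filter."""
    smoothed = medfilt(signal, kernel_size=5)  # kernel size is an assumption
    # Butterworth design and cutoff are assumptions; second-order sections
    # keep a 12th-order filter numerically stable.
    sos = butter(12, cutoff_hz, btype="low", fs=fs, output="sos")
    return sosfilt(sos, smoothed)


def haar_dwt(x, levels):
    """Multi-level Haar wavelet transform: [detail_1, ..., detail_n, approx_n]."""
    approx = np.asarray(x, dtype=float)
    coeffs = []
    for _ in range(levels):
        if approx.size % 2:  # drop a trailing sample so pairs line up
            approx = approx[:-1]
        even, odd = approx[0::2], approx[1::2]
        coeffs.append((even - odd) / np.sqrt(2))  # detail coefficients
        approx = (even + odd) / np.sqrt(2)        # approximation coefficients
    coeffs.append(approx)
    return coeffs


def wavelet_features(signal, levels=4):
    """Mean, standard deviation, and energy of each wavelet sub-band."""
    feats = []
    for band in haar_dwt(signal, levels):
        feats.extend([band.mean(), band.std(), np.sum(band ** 2)])
    return np.array(feats)


# Synthetic stand-in data: narrow vs. wide noisy pulses (NOT real ECG beats).
rng = np.random.default_rng(0)
t = np.arange(256)


def synthetic_beat(width):
    pulse = np.exp(-0.5 * ((t - 128) / width) ** 2)
    return pulse + 0.1 * rng.standard_normal(t.size)


signals = [synthetic_beat(4.0) for _ in range(100)] + \
          [synthetic_beat(12.0) for _ in range(100)]
y = np.array([0] * 100 + [1] * 100)

# Stage 1 + 2: denoise each signal, then extract wavelet features.
X = np.array([wavelet_features(denoise(s)) for s in signals])

# Stage 3: train and evaluate a random forest on the wavelet features.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)
clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_train, y_train)
acc = clf.score(X_test, y_test)
print(f"held-out accuracy on synthetic data: {acc:.2f}")
```

With real recordings, the synthetic pulse generator would be replaced by segmented ECG beats and their annotated labels; the per-band statistics used here are only one of many possible wavelet feature sets.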
