Ensemble classifier with Random Forest algorithm to deal with imbalanced healthcare data

In day-to-day practice, the healthcare environment generates massive and rapidly growing amounts of data. The medical industry holds large datasets for diagnosis and for maintaining patient records, and medical research produces new treatments and medicines every day. However, the available medical datasets are often imbalanced in their class labels, and some existing classification methods perform poorly on imbalanced data, which makes disease prediction from such data difficult. This proposal uses a classifier ensemble method, the Random Forest algorithm, to overcome the limitations of existing single-classifier techniques. A multiple classifier system is more accurate and robust than a single classifier. The ensemble method proves efficient at classifying records from the available imbalanced healthcare patient data because it combines the opinions of multiple base classifiers rather than relying on a single classifier. It gives accurate and precise inferences because the multiple base classifiers filter out irrelevant data. The approach can also handle prediction problems on healthcare datasets that contain some uncertainty.
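The idea of combining the opinions of multiple base classifiers on imbalanced data can be illustrated with a minimal sketch. It is not the authors' implementation and not a full Random Forest: it assumes two classes, uses one-feature threshold "stumps" instead of decision trees, and balances each bootstrap sample by drawing equally from every class. All function names and the synthetic data are hypothetical and chosen only for illustration.

```python
import random
from collections import Counter


def majority_vote(predictions):
    """Return the label chosen by the most base classifiers."""
    return Counter(predictions).most_common(1)[0][0]


def balanced_bootstrap(rows, labels, rng):
    """Draw, with replacement, the same number of rows from every class."""
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)
    n = min(len(members) for members in by_class.values())
    sample = []
    for label, members in by_class.items():
        sample += [(rng.choice(members), label) for _ in range(n)]
    return sample


def fit_stump(sample, feature):
    """Two-class base classifier: threshold one feature at the midpoint
    between the two class means (a stand-in for a decision tree)."""
    means = {}
    for label in {lab for _, lab in sample}:
        values = [row[feature] for row, lab in sample if lab == label]
        means[label] = sum(values) / len(values)
    (lo_label, lo), (hi_label, hi) = sorted(means.items(), key=lambda kv: kv[1])
    threshold = (lo + hi) / 2
    return lambda row: hi_label if row[feature] > threshold else lo_label


def fit_ensemble(rows, labels, n_estimators=25, seed=0):
    """Train each base classifier on its own balanced bootstrap sample
    and on one randomly chosen feature."""
    rng = random.Random(seed)
    n_features = len(rows[0])
    ensemble = []
    for _ in range(n_estimators):
        sample = balanced_bootstrap(rows, labels, rng)
        feature = rng.randrange(n_features)
        ensemble.append(fit_stump(sample, feature))
    return ensemble


def predict(ensemble, row):
    """Classify a row by majority vote over all base classifiers."""
    return majority_vote([clf(row) for clf in ensemble])


# Synthetic, imbalanced toy data (assumption): 95 negatives, 5 positives.
rows = [(i % 10, (i * 3) % 10) for i in range(95)]
rows += [(20 + i, 20 + i) for i in range(5)]
labels = [0] * 95 + [1] * 5

ensemble = fit_ensemble(rows, labels)
minority_prediction = predict(ensemble, (22, 22))
majority_prediction = predict(ensemble, (3, 4))
```

Because every bootstrap sample contains as many minority as majority records, no single base classifier is dominated by the majority class. In practice a library implementation of Random Forest with class weighting or resampling would replace the stumps used here.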
