Performance analysis of machine learning algorithms on diabetes dataset using big data analytics

New technologies such as big data and cloud computing are playing a vital role in providing solutions to healthcare problems. Healthcare data is growing rapidly day by day, and it requires efficient, effective and timely solutions to reduce the mortality rate. One of the most critical chronic healthcare problems is diabetes. In the long run, if it is not treated properly, diabetes may damage the eyes, heart, kidneys and nerves of the patient, and can also lead to death. The aim of this paper is to analyze and compare different machine learning algorithms in order to identify the best predictive algorithm based on various metrics such as accuracy, kappa, precision, recall, sensitivity and specificity. A comprehensive study is carried out on a diabetes dataset with the Random Forest (RF), SVM, k-NN, CART and LDA algorithms. The results show that RF gives more accurate predictions than the other algorithms.
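
To make the evaluation metrics named above concrete, the following is a minimal sketch of how they can be computed from the four cells of a binary confusion matrix. The function name `binary_metrics` and the example counts are hypothetical and are not taken from the paper's experiments; kappa is assumed to mean Cohen's kappa. Note that recall and sensitivity are the same quantity under this definition.

```python
def binary_metrics(tp, fp, fn, tn):
    """Compute common binary-classification metrics from confusion-matrix counts.

    tp, fp, fn, tn: true positives, false positives, false negatives,
    true negatives. Assumes every denominator below is non-zero.
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)          # also called sensitivity
    specificity = tn / (tn + fp)

    # Cohen's kappa: observed agreement corrected for chance agreement.
    p_yes = ((tp + fp) / total) * ((tp + fn) / total)
    p_no = ((fn + tn) / total) * ((fp + tn) / total)
    expected = p_yes + p_no
    kappa = (accuracy - expected) / (1 - expected)

    return {
        "accuracy": accuracy,
        "kappa": kappa,
        "precision": precision,
        "recall": recall,
        "sensitivity": recall,
        "specificity": specificity,
    }


# Hypothetical counts for illustration only (not results from the paper).
metrics = binary_metrics(tp=40, fp=10, fn=20, tn=30)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With these illustrative counts, accuracy is 0.7 while kappa is 0.4, which shows why kappa is reported alongside accuracy: it discounts the agreement that would be expected by chance.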