Health data analytics using scalable logistic regression with stochastic gradient descent

Wearable medical sensors continuously generate enormous volumes of data that are difficult to process and analyse. This paper develops a scalable sensor data processing architecture in cloud computing to store and process body sensor data for healthcare applications. The proposed architecture uses big data technologies such as Apache Flume, Apache Pig and Apache HBase to collect and store large volumes of sensor data in Amazon Web Services. The Apache Mahout implementation of the MapReduce-based online stochastic gradient descent algorithm is used to fit a logistic regression and build a scalable diagnosis model. The Cleveland heart disease database (CHDD) is used to train the logistic regression model. Wearable body sensors measure the patient's blood pressure, blood sugar level and heart rate, which are used to predict heart disease status. The proposed prediction model classifies heart disease with an accuracy of 81.99% on the training sample and 81.52% on the validation sample.
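
The core of the diagnosis model, logistic regression fitted by online stochastic gradient descent, can be illustrated with a minimal single-machine sketch. This is not the Apache Mahout or MapReduce implementation the paper uses; the function names (`sgd_step`, `train`), the learning rate, the epoch count and the toy samples are all assumptions made for illustration. The three features stand in for standardized blood pressure, blood sugar level and heart rate, and the values are synthetic, not CHDD records.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_proba(weights, bias, x):
    # Estimated probability that sample x belongs to the positive class.
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))


def sgd_step(weights, bias, x, y, lr):
    # One online update: the log-loss gradient for a single sample
    # is (p - y) * x for the weights and (p - y) for the bias.
    error = predict_proba(weights, bias, x) - y
    new_weights = [w - lr * error * xi for w, xi in zip(weights, x)]
    return new_weights, bias - lr * error


def train(samples, epochs=50, lr=0.1, seed=0):
    # Visit the samples one at a time in shuffled order, updating after each.
    rng = random.Random(seed)
    weights, bias = [0.0] * len(samples[0][0]), 0.0
    data = list(samples)
    for _ in range(epochs):
        rng.shuffle(data)
        for x, y in data:
            weights, bias = sgd_step(weights, bias, x, y, lr)
    return weights, bias


# Synthetic, standardized toy samples (hypothetical values):
# (blood pressure, blood sugar, heart rate) -> 1 if disease, else 0.
toy_samples = [
    ([-1.0, -0.8, -0.5], 0),
    ([-0.6, -1.0, -0.9], 0),
    ([0.9, 0.7, 1.1], 1),
    ([1.2, 0.5, 0.8], 1),
]

weights, bias = train(toy_samples)
risk = predict_proba(weights, bias, [1.0, 1.0, 1.0])
```

Because each update touches only one sample, the same rule can process readings as they arrive, which is the property that makes the online variant attractive for streaming sensor data. How the paper distributes the updates across MapReduce tasks is not described in the abstract and is not modelled here.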
