Predicting the risk of readmission of diabetic patients using MapReduce

From banking to retail, many sectors have already embraced big data, whether the information comes from public or private sources. In the clinical sphere, the amount of patient data has grown exponentially because of computer-based information systems. E-health monitoring applications have particular requirements regarding data quality. This paper presents a novel solution that uses Hadoop MapReduce to analyze large datasets and extract useful insights that help doctors allocate resources effectively. Successful healthcare delivery and planning rely strongly on data (e.g., sensed data, diagnoses, administrative information); the higher the quality of the data, the better the patient assistance. These applications are also particularly exposed to their contextual environment (e.g., patient mobility, communication technologies, performance, information heterogeneity), which has an important impact on information management and application effectiveness. The main objective of our system is to predict the risk that a diabetic patient will be readmitted within the next 30 days by estimating the probability of readmission using MapReduce. This risk score helps physicians recommend appropriate care for their patients.
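To make the map and reduce roles concrete, the following is a minimal in-process sketch, not the system described in this paper. It assumes that risk is estimated as the empirical 30-day readmission frequency within a patient group; the record layout, the group names, and the helper functions (`map_record`, `shuffle`, `reduce_group`, `run_job`) are all hypothetical, and a real Hadoop job would distribute these steps across a cluster.

```python
from collections import defaultdict

# Hypothetical records: (patient group, readmitted within 30 days?).
# The grouping key and the sample values are illustrative assumptions.
RECORDS = [
    ("age_60_70", True),
    ("age_60_70", False),
    ("age_70_80", True),
    ("age_70_80", True),
    ("age_60_70", False),
    ("age_70_80", False),
]


def map_record(record):
    """Map step: emit (group, (readmitted_count, total_count)) for one record."""
    group, readmitted = record
    yield group, (1 if readmitted else 0, 1)


def shuffle(pairs):
    """Group mapped values by key, standing in for the framework's shuffle phase."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_group(group, values):
    """Reduce step: sum the counts and return the empirical readmission rate."""
    readmitted = sum(v[0] for v in values)
    total = sum(v[1] for v in values)
    return group, readmitted / total


def run_job(records):
    """Run map, shuffle, and reduce sequentially in a single process."""
    mapped = (pair for record in records for pair in map_record(record))
    return dict(reduce_group(k, vs) for k, vs in shuffle(mapped).items())


risk_by_group = run_job(RECORDS)
```

Emitting a (readmitted, total) pair rather than a precomputed rate keeps the reduce step a plain sum, so partial counts from different mappers can be combined in any order before the final division.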