Reality mining and predictive analytics for building smart applications

Background: Mobile phones and sensors have become valuable tools for understanding and analyzing human lifestyles because of the large amount of data they collect every second. This motivated the idea of combining the advantages of reality mining, machine learning and big data predictive analytics tools, and applying them to smartphone and sensor data in real time. The main goal of our study is to build a system that interacts with mobile phones and wearable healthcare sensors to predict patterns.

Methods: Wearable healthcare sensors (a heart rate sensor, a temperature sensor and an activity sensor) and a mobile phone are used to gather data in real time. All sensors are managed using IoT systems; we used Arduino to collect data from the health sensors and Raspberry Pi 3 for programming and processing. The K-means clustering algorithm is used for pattern prediction, and the predicted clusters (partitions) are transmitted to the user through the front-end interface of the mobile application. Real-world data and clustering validation statistics (the Elbow method and the Silhouette method) are used to validate the proposed system and assess its performance and effectiveness. All data management and processing tasks are conducted on Apache Spark Databricks.

Results: The system relies on data gathered in real time and can be applied to any prediction case that makes use of sensor and mobile-generated data. As a proof of concept, we worked on predicting miscarriages: we built a real-time miscarriage prediction system using wearable healthcare sensors, mobile tools, data mining algorithms and big data technologies, to help pregnant women make quick decisions in the event of a miscarriage or probable miscarriage. Nine risk factors contribute strongly to the prediction. The Elbow method indicates that the optimal number of clusters is 2, and we obtain a high Silhouette width (0.95), which confirms a good match between clusters and observations. The K-means algorithm gives good results in clustering the data.
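The validation procedure described in the Methods (K-means, then the Elbow and Silhouette methods to choose the number of clusters) can be illustrated with a minimal sketch. It is not the authors' Spark pipeline: it is plain Python using only the standard library, the data are synthetic two-feature readings (a hypothetical heart rate and temperature) made up for illustration, and the helper names `kmeans` and `silhouette` are our own. The within-cluster sum of squares (WCSS) is the quantity one would plot for the Elbow method.

```python
import math
import random


def dist(a, b):
    """Euclidean distance between two points given as equal-length tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def kmeans(points, k, iters=50, seed=0):
    """Lloyd's algorithm. Returns (labels, centroids, within-cluster sum of squares)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initialise with k distinct observations
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [min(range(k), key=lambda c: dist(p, centroids[c])) for p in points]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
        converged = new_labels == labels
        labels = new_labels
        if converged:
            break
    wcss = sum(dist(p, centroids[lab]) ** 2 for p, lab in zip(points, labels))
    return labels, centroids, wcss


def silhouette(points, labels):
    """Mean silhouette width; values near 1 mean compact, well-separated clusters."""
    clusters = sorted(set(labels))
    scores = []
    for i, p in enumerate(points):
        mean_to = {}  # mean distance from p to the other members of each cluster
        for c in clusters:
            others = [q for j, q in enumerate(points) if labels[j] == c and j != i]
            if others:
                mean_to[c] = sum(dist(p, q) for q in others) / len(others)
        own = labels[i]
        if own not in mean_to or len(mean_to) < 2:
            scores.append(0.0)  # singleton cluster or single cluster: width set to 0
            continue
        a = mean_to[own]                                    # cohesion
        b = min(v for c, v in mean_to.items() if c != own)  # separation
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)


# Synthetic, made-up readings: (heart rate in bpm, temperature in deg C).
rng = random.Random(42)
points = [(rng.gauss(70, 5), rng.gauss(36.6, 0.2)) for _ in range(30)]
points += [(rng.gauss(110, 5), rng.gauss(38.2, 0.2)) for _ in range(30)]

wcss_by_k, sil_by_k = {}, {}
for k in range(1, 6):
    labels, _, wcss = kmeans(points, k)
    wcss_by_k[k] = wcss
    if k >= 2:  # the silhouette width is undefined for a single cluster
        sil_by_k[k] = silhouette(points, labels)
    print(k, round(wcss, 1), round(sil_by_k[k], 3) if k >= 2 else "n/a")
```

On data like this, the WCSS drops sharply from k = 1 to k = 2 and flattens afterwards (the "elbow"), while the silhouette width is read off per k; the features are left unscaled here for brevity, which a real pipeline would likely not do.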
