In-Mapper combiner based MapReduce algorithm for processing of big climate data

Abstract: Big data refers to collections of data so large that conventional data processing tools and technologies cannot handle them. In recent years the number of data sources has grown noticeably and now includes high-end streaming devices, wireless sensor networks, satellites, and wearable Internet of Things (IoT) devices. These sources generate massive volumes of data continuously. In this work, a large volume of climate data is collected from IoT weather sensor devices and NCEP. This paper proposes a big data processing framework that integrates climate and health data and finds the correlation between climate parameters and the incidence of dengue. The framework is demonstrated using the MapReduce programming model, Hive, HBase, and ArcGIS in a Hadoop Distributed File System (HDFS) environment. The weather parameters collected for the study area, Tamil Nadu, are minimum temperature, maximum temperature, wind, precipitation, solar, and relative humidity. The proposed framework focuses only on climate data for 32 districts of Tamil Nadu; each district contributes 157,680 rows, for 5,045,760 rows in total. Precomputing a batch view of the monthly mean of the climate parameters would require processing all 5,045,760 rows, which adds latency to query processing. To overcome this issue, batch views can be precomputed over a smaller number of records, leaving more computation to be done at query time. The In-Mapper combiner based MapReduce framework is used to compute the monthly mean of each climate parameter for each latitude and longitude. Experimental results show that the response time of the In-Mapper combiner based algorithm is lower than that of the existing MapReduce algorithm.
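The in-mapper combining idea behind the monthly mean computation can be illustrated with a minimal Python sketch. This is not the authors' Hadoop implementation. The record layout `(lat, lon, "YYYY-MM-DD", value)`, the class name `InMapperCombinerMapper`, and the helpers `reduce_mean` and `run_job` are all assumptions made for illustration. Each mapper keeps a running (sum, count) per key in memory and emits one pair per key when its input split is exhausted, so far fewer intermediate pairs reach the reducer than with one pair per input row.

```python
from collections import defaultdict


class InMapperCombinerMapper:
    """Hypothetical mapper that aggregates locally before emitting."""

    def __init__(self):
        # key -> [running sum, running count], held in mapper memory
        self.partials = defaultdict(lambda: [0.0, 0])

    def map(self, record):
        # Assumed record layout: (lat, lon, "YYYY-MM-DD", value)
        lat, lon, date, value = record
        key = (lat, lon, date[:7])  # group by location and year-month
        acc = self.partials[key]
        acc[0] += value
        acc[1] += 1

    def close(self):
        # Emit one (sum, count) pair per key once the split is exhausted
        for key, (total, count) in self.partials.items():
            yield key, (total, count)


def reduce_mean(key, partials):
    # Sum the partial sums and counts so the mean stays exact across mappers
    total = sum(s for s, _ in partials)
    count = sum(c for _, c in partials)
    return key, total / count


def run_job(splits):
    """Simulate map, shuffle, and reduce over a list of input splits."""
    shuffled = defaultdict(list)
    for split in splits:
        mapper = InMapperCombinerMapper()
        for record in split:
            mapper.map(record)
        for key, partial in mapper.close():
            shuffled[key].append(partial)
    return dict(reduce_mean(k, v) for k, v in shuffled.items())
```

Emitting (sum, count) pairs rather than local means matters here: averaging per-mapper means would give a wrong result whenever splits hold different numbers of rows for the same key. The trade-off is that the mapper must hold one accumulator per distinct key in memory.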
