Air pollution analysis using enhanced K-Means clustering algorithm for real time sensor data

Air pollution harms human organs and body systems as well as the environment. A smart air pollution monitoring system consists of wireless sensor nodes, a server, and a database that stores the monitored data. The gas sensors in such a system generate huge amounts of data, and traditional methods are too complex to process and analyze data of this volume. Data mining approaches convert the heterogeneous data into meaningful information for decision making. The K-Means algorithm is one of the frequently used clustering methods in data mining for clustering massive data sets. In this paper, an enhanced K-Means clustering algorithm is proposed to analyze air pollution data. The correlation coefficient is calculated from the real-time monitored pollutant datasets, and the Air Quality Index (AQI) value is calculated from the correlation coefficient to determine the air pollution level at a particular place. The proposed enhanced K-Means clustering algorithm is compared with the Possibilistic Fuzzy C-Means (PFCM) clustering algorithm in terms of accuracy and execution time. Experimental results show that the proposed enhanced K-Means clustering algorithm yields AQI values with higher accuracy and lower execution time than existing techniques.
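To make the two building blocks named above concrete, the following is a minimal sketch of standard (Lloyd's) K-Means and the Pearson correlation coefficient. It is not the enhanced variant proposed in the paper, whose details the abstract does not give, and it does not attempt the AQI calculation, since the mapping from correlation coefficient to AQI is not specified here. The function names `kmeans` and `pearson` and the sample readings are assumptions made for illustration only.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def kmeans(points, k, iterations=100, seed=0):
    """Standard Lloyd's K-Means on a list of equal-length tuples.

    Returns (centroids, labels), where labels[i] is the cluster index
    assigned to points[i].
    """
    rng = random.Random(seed)
    # Initialise centroids with k distinct points chosen at random.
    centroids = rng.sample(points, k)
    labels = []
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid becomes the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(
                    tuple(sum(col) / len(members) for col in zip(*members))
                )
            else:
                # Keep the old centroid if a cluster ends up empty.
                new_centroids.append(centroids[c])
        if new_centroids == centroids:
            break  # Converged: no centroid moved.
        centroids = new_centroids
    return centroids, labels


# Hypothetical (CO, NO2) readings: three low-pollution and three
# high-pollution samples, invented purely for this example.
readings = [
    (0.4, 20.0), (0.5, 22.0), (0.6, 21.0),
    (3.9, 80.0), (4.1, 82.0), (4.0, 85.0),
]
centroids, labels = kmeans(readings, k=2)
co = [r[0] for r in readings]
no2 = [r[1] for r in readings]
correlation = pearson(co, no2)
```

Here `labels` groups the readings into two clusters, and `correlation` measures how strongly the two pollutant series move together. Any comparison with PFCM, or any AQI figure, would require the method details given in the full paper.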
