Prediction of Air Pollution through Machine Learning Approaches on the Cloud

Prediction of air pollution is an increasingly important problem. Air pollution can affect individuals and their health, e.g. asthma patients can be greatly affected by it. Traditional air pollution prediction methods have limitations. Machine learning offers new opportunities for the prediction of air pollution. There are, however, many different machine learning approaches, and identifying the best one for the problem at hand is often challenging. In this paper, air pollution data, specifically particulate matter of less than 2.5 micrometers in diameter (PM2.5), was collected from a variety of web-based resources and, following data cleansing, analysed with different machine learning models including linear regression, Artificial Neural Networks and Long Short-Term Memory recurrent neural networks. We consider the accuracy of these models and their ability to predict unhealthy levels of pollution. The advantages and disadvantages of these models are also discussed.