Data Mining Techniques for the Estimation of Variables in Health-Related Noisy Data

Public health in developed countries is heavily affected by pollution, especially in highly populated areas. Among the pollutants with the greatest impact on health, ozone is the focus of this paper because of its effect on cardiovascular and respiratory problems and their prevalence in developed societies. Local authorities are compelled to provide satisfactory predictions of ozone levels, and thus the need for proper estimation tools arises. A data-driven approach to prediction demands high-quality data, but the observations collected by weather stations usually fail to meet this requirement. This paper reports a new approach to robust prediction of ozone levels that uses an outlier detection technique in an innovative way. The aim is to assess the feasibility of using raw data without preprocessing to obtain results similar to or better than those obtained with traditional outlier removal techniques. A dataset from Ponferrada, Spain, is used in an experimental stage in which this approach provides satisfactory results in a difficult case.
