Fusion of meteorological and air quality data extracted from the web for personalized environmental information services

A large amount of meteorological and air quality data is available online. Different sources often provide deviating, and even contradictory, data for the same geographical area and time. Users therefore have to judge the relative reliability of the sources and then trust one of them. We present a novel data fusion method that merges the data from different sources for a given area and time, with the aim of ensuring the best data quality. The method is a unique combination of land-use regression techniques, statistical air quality modelling and a well-known data fusion algorithm. We show experiments in which a fused temperature forecast outperforms the individual temperature forecasts from several providers. We also demonstrate that the local hourly NO2 concentration can be estimated accurately with our fusion method, while a more conventional extrapolation method falls short; with the presented method, the hourly concentration of air quality pollutants can be assessed in varying urban environments. The method forms part of the prototype web-based service PESCaDO, which is designed to deliver personalized environmental information to users. We also present a system called AirMerge, which converts image-based concentration maps into numerical data for fusion.
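
The abstract does not name the fusion algorithm it uses, so the following is only a minimal sketch of one common way to merge several forecasts for the same place and time: inverse-variance weighting. It is not the authors' method. The function name, the sample temperatures and the error variances are all hypothetical and chosen purely for illustration.

```python
from typing import Sequence, Tuple


def fuse_inverse_variance(
    values: Sequence[float], variances: Sequence[float]
) -> Tuple[float, float]:
    """Fuse estimates of one quantity by inverse-variance weighting.

    Returns the fused value and the variance of the fused value.
    Assumes the source errors are unbiased and mutually independent
    (an illustrative assumption, not a claim from the paper).
    """
    if not values or len(values) != len(variances):
        raise ValueError("values and variances must be non-empty and equal in length")
    if any(v <= 0 for v in variances):
        raise ValueError("variances must be strictly positive")

    # A source with a smaller error variance gets a larger weight.
    weights = [1.0 / v for v in variances]
    total_weight = sum(weights)
    fused_value = sum(w * x for w, x in zip(weights, values)) / total_weight
    fused_variance = 1.0 / total_weight
    return fused_value, fused_variance


# Hypothetical temperature forecasts (degrees Celsius) from three providers,
# with hypothetical error variances estimated from past performance.
forecasts_c = [21.0, 23.0, 22.0]
error_variances = [1.0, 4.0, 2.0]

fused, fused_var = fuse_inverse_variance(forecasts_c, error_variances)
print(f"fused forecast: {fused:.2f} C, variance: {fused_var:.2f}")
```

Under these assumptions the fused variance is never larger than the smallest input variance, which is one intuition for why a fused forecast can outperform each individual provider.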
