Environmental and Pollution Data Mapping with Support Vector Regression

The present work deals with the first application of Support Vector Regression (SVR) to spatial data mapping. SVR is a recent development of Statistical Learning Theory (Vapnik-Chervonenkis theory). It is based on Structural Risk Minimisation and appears to be a promising approach for spatial data analysis and processing. SVR has several attractive properties: robustness of the solution, which is important in many applications; sparseness of the regression; automatic control of the solution's complexity; and good generalisation. The present work reports results of applying SVR to real data on soil contamination by Chernobyl radionuclides. By tuning the SVR hyper-parameters it was possible to cover the range of spatial function regression from overfitting to oversmoothing. Geostatistical tools of structural analysis (variography) were used both for exploratory analysis of the raw data and for understanding and interpreting the SVR results. Variography was also used to control the performance of the SVR and to tune its hyper-parameters. This report is based on a scientific collaboration between INSA Rouen, IDIAP, UNI Lausanne and IBRAE (Moscow) within the framework of an INTAS grant on Environmental Data Mining.
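
As an illustration of the variography step, a minimal sketch of an omnidirectional experimental semivariogram is given below. It is not the implementation used in this work: the function name empirical_semivariogram, the lag settings and the synthetic data are assumptions made for the example, and only the Python standard library is used. One common reading, stated here as an assumption, is that a roughly flat (pure nugget) semivariogram of the SVR residuals suggests the model has captured the spatial structure, whereas residuals that still show structure suggest oversmoothing.

```python
import math
import random


def empirical_semivariogram(coords, values, lag_width, n_lags):
    """Omnidirectional experimental semivariogram (hypothetical helper).

    For each lag bin k, gamma_k = sum((z_i - z_j)^2) / (2 * N_k), where the
    sum runs over the N_k point pairs whose separation distance h satisfies
    k * lag_width <= h < (k + 1) * lag_width.
    """
    sums = [0.0] * n_lags
    counts = [0] * n_lags
    n = len(values)
    for i in range(n):
        for j in range(i + 1, n):
            h = math.dist(coords[i], coords[j])  # Euclidean distance
            k = int(h // lag_width)              # index of the lag bin
            if k < n_lags:
                sums[k] += (values[i] - values[j]) ** 2
                counts[k] += 1
    # Bin centres, and NaN for bins that contain no pairs.
    lags = [(k + 0.5) * lag_width for k in range(n_lags)]
    gammas = [s / (2 * c) if c else float("nan") for s, c in zip(sums, counts)]
    return lags, gammas, counts


# Synthetic illustration only (not the Chernobyl data).
random.seed(0)
coords = [(random.uniform(0, 10), random.uniform(0, 10)) for _ in range(200)]

# A smooth field: nearby points are similar, so gamma should grow with lag.
structured = [math.sin(x / 2.0) + math.cos(y / 2.0) for x, y in coords]

# White noise has no spatial structure, so its semivariogram is expected
# to be roughly flat. This is the pattern one would hope to see in the
# residuals of a well-tuned model.
noise = [random.gauss(0.0, 1.0) for _ in coords]

lags, g_struct, _ = empirical_semivariogram(coords, structured, 1.0, 5)
_, g_noise, _ = empirical_semivariogram(coords, noise, 1.0, 5)

for h, gs, gn in zip(lags, g_struct, g_noise):
    print(f"lag {h:4.1f}  structured {gs:6.3f}  noise {gn:6.3f}")
```

In practice the same function would be applied to the raw measurements, to the SVR predictions and to the residuals, and the three curves compared for each choice of hyper-parameters.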