Asymmetric kernel functions in non-parametric regression analysis and prediction

Non-parametric kernel and nearest neighbour estimates are flexible alternatives to parametric modelling. When the density of the explanatory variables is not constant, kernel estimates are usually biased in finite samples, even when the regression function is linear. This can be seen from the asymptotic expression for the bias of kernel regression estimates, derived under certain mixing conditions. In this paper, bias reduction techniques using asymmetric kernel functions are suggested. In contrast to well-designed experiments, the positions of the design points in environmental data analysis are not under control, so the measurements of the explanatory variables are arbitrarily scattered over the factor space in an unbalanced way. In this case the asymmetric kernel techniques outperform the usual symmetric kernel methods, as demonstrated by applying both methods to the relationship between the oxygen concentration and the temperature of river water. Furthermore, the new methods lead to better predictions in an autoregressive time series model for air pollution measurements such as nitrogen dioxide and sulphur dioxide concentrations.
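
The design-density bias described above can be illustrated with a minimal sketch. It is not the paper's method or data: it assumes a Nadaraya-Watson estimator with a symmetric Gaussian kernel, a hypothetical linear regression function m(x) = 2x + 1, noise-free responses, and a deterministic design placed at the quantiles of the density f(x) = 1/(2 sqrt(x)) on (0, 1]. The comparison value is the standard textbook leading-order bias term for this estimator, h^2 * mu_2(K) * m'(x) * f'(x)/f(x) when m is linear, which may differ in form from the expression the paper derives under mixing conditions.

```python
import math


def nadaraya_watson(x0, xs, ys, h):
    """Nadaraya-Watson estimate at x0 with a symmetric Gaussian kernel."""
    weights = [math.exp(-0.5 * ((x - x0) / h) ** 2) for x in xs]
    return sum(w * y for w, y in zip(weights, ys)) / sum(weights)


def m(x):
    # Hypothetical linear regression function, chosen only for illustration.
    return 2.0 * x + 1.0


n = 2000
# Deterministic, unbalanced design: quantiles of f(x) = 1 / (2 * sqrt(x)),
# whose distribution function is sqrt(x), so the p-quantile is p ** 2.
xs = [((i - 0.5) / n) ** 2 for i in range(1, n + 1)]
ys = [m(x) for x in xs]  # noise-free, so any error is pure smoothing bias

x0, h = 0.5, 0.1  # evaluation point and bandwidth (assumed values)
estimate = nadaraya_watson(x0, xs, ys, h)
bias = estimate - m(x0)

# Leading-order term: mu_2(K) = 1 for the Gaussian kernel, m'(x) = 2,
# and f'(x) / f(x) = -1 / (2 * x) for the assumed design density.
leading_term = h ** 2 * 1.0 * 2.0 * (-1.0 / (2.0 * x0))

print(f"estimate = {estimate:.4f}, true value = {m(x0):.4f}")
print(f"bias = {bias:.4f}, leading-order term = {leading_term:.4f}")
```

Because the design density decreases at x0, more observations lie to the left of x0 than to the right, and the symmetric kernel average is pulled below the true value even though m is exactly linear and the data contain no noise.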