A variational EM approach to predictive uncertainty

In many applications of regression, the conditional average of the target variable is not sufficient for prediction. The dependencies between the explanatory variables and the target variable can be complex, calling for modelling of the full conditional probability density. The ubiquitous problem with such methods is overfitting: because the model is so flexible, the likelihood of any single datapoint can be made arbitrarily large. This paper presents a method for predicting uncertainty by modelling the conditional density, based on conditioning the scale parameter of the noise process on the explanatory variables. The model is constructed in such a manner that the unpredictability of the scale of the target distribution translates into a more robust predictive distribution. The overfitting problem is solved by learning the model using variational EM. The method is demonstrated experimentally with synthetic data as well as with real-world environmental data. The viability of the approach was put to the test in the 'Predictive uncertainty in environmental modelling' competition held at WCCI'06, which the proposed method won.
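The overfitting problem mentioned above can be illustrated with a minimal sketch. It assumes Gaussian noise whose log-scale depends on the input and uses plain maximum likelihood rather than the variational EM procedure of the paper. The function names (`gaussian_log_density`, `heteroscedastic_log_likelihood`) and the example value of `y` are hypothetical and chosen only for illustration.

```python
import math


def gaussian_log_density(y, mean, log_scale):
    """Log-density of y under a Gaussian with the given mean and scale exp(log_scale)."""
    scale = math.exp(log_scale)
    return -0.5 * math.log(2 * math.pi) - log_scale - 0.5 * ((y - mean) / scale) ** 2


def heteroscedastic_log_likelihood(xs, ys, mean_fn, log_scale_fn):
    """Sum of log-densities when both mean and log-scale are functions of the input x."""
    return sum(
        gaussian_log_density(y, mean_fn(x), log_scale_fn(x))
        for x, y in zip(xs, ys)
    )


# Overfitting illustration (assumed toy value): if the mean fits a datapoint
# exactly, shrinking the scale at that point raises its log-density without bound.
y = 1.3
log_densities = [
    gaussian_log_density(y, mean=y, log_scale=s) for s in (0.0, -5.0, -10.0)
]
```

With the mean fixed at the observed value, each reduction of the log-scale by one unit adds one unit to the log-density, so a sufficiently flexible scale function can inflate the training likelihood indefinitely. This is the behaviour that motivates a learning scheme less prone to such point-wise collapse.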
