Data Imputation and Robust Training with Gaussian Processes
When training a regression model from observations, it is often assumed that only the outputs are noisy. When the inputs are also known (or suspected) to be corrupted, the challenge is to account for this uncertainty properly. In all but the simplest models, integrating out the inputs is intractable, even when the true input distribution is known. We present an approach for Gaussian process models that simultaneously accounts for input uncertainty, thus improving future predictions, and estimates the true values of the noisy inputs. Our algorithm is based on lower bounding the true marginal likelihood and takes the form of an expectation-maximization procedure, alternately updating the model parameters and adjusting the estimates of the cleaned input points.
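The alternating structure described in the abstract can be illustrated with a small sketch. This is not the paper's algorithm: instead of a variational lower bound, it uses a simplified point-estimate surrogate (negative GP log marginal likelihood plus a Gaussian penalty tying each cleaned input to its noisy observation). The kernel choice, the lengthscale grid, the names `objective`, `alternate`, `noise_var`, and `input_var`, and the toy data are all assumptions made for illustration.

```python
import numpy as np

def rbf_kernel(A, B, lengthscale, signal_var=1.0):
    """Squared-exponential kernel between row-wise inputs A (n, d) and B (m, d)."""
    sq_dist = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return signal_var * np.exp(-0.5 * sq_dist / lengthscale**2)

def objective(X, Z, y, lengthscale, noise_var=0.01, input_var=0.04):
    """Negative GP log marginal likelihood of y at inputs X, plus a Gaussian
    penalty pulling X toward the observed noisy inputs Z (assumed noise model)."""
    n = len(y)
    K = rbf_kernel(X, X, lengthscale) + noise_var * np.eye(n)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    nlml = 0.5 * y @ alpha + np.log(np.diag(L)).sum() + 0.5 * n * np.log(2 * np.pi)
    return nlml + 0.5 * ((X - Z) ** 2).sum() / input_var

def numerical_grad(f, X, eps=1e-5):
    """Central finite-difference gradient of f with respect to every entry of X."""
    grad = np.zeros_like(X)
    for idx in np.ndindex(X.shape):
        step = np.zeros_like(X)
        step[idx] = eps
        grad[idx] = (f(X + step) - f(X - step)) / (2 * eps)
    return grad

def alternate(Z, y, lengthscales, n_rounds=5):
    """Alternate between a parameter update and an input update."""
    X = Z.copy()                       # start from the noisy inputs
    ls = lengthscales[0]
    history = [objective(X, Z, y, ls)]
    for _ in range(n_rounds):
        # Parameter update: pick the best lengthscale on a grid, inputs fixed.
        ls = min(lengthscales, key=lambda l: objective(X, Z, y, l))
        # Input update: one gradient step with backtracking, parameters fixed.
        f = lambda Xc: objective(Xc, Z, y, ls)
        g = numerical_grad(f, X)
        current, step = f(X), 0.1
        while step > 1e-8:
            candidate = X - step * g
            if f(candidate) < current:  # accept only if the objective drops
                X = candidate
                break
            step *= 0.5
        history.append(f(X))
    return X, ls, history

# Toy data (hypothetical): a sine curve with noise on both inputs and outputs.
rng = np.random.default_rng(0)
X_true = np.linspace(-3, 3, 20)[:, None]
y = np.sin(X_true[:, 0]) + 0.1 * rng.standard_normal(20)
Z = X_true + 0.2 * rng.standard_normal((20, 1))

X_hat, ls_hat, history = alternate(Z, y, lengthscales=[0.5, 1.0, 2.0])
```

Because the grid always contains the current lengthscale and an input step is accepted only when it lowers the objective, `history` is non-increasing by construction. A faithful implementation would replace the point estimates of the inputs with a distribution over them and optimize a proper lower bound on the marginal likelihood.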