Data Imputation and Robust Training with Gaussian Processes

When training a regression model from observations, it is often assumed that only the outputs are noisy. When the inputs are also known (or suspected) to be corrupted, the challenge is to account for this uncertainty properly. In all but the simplest models, integrating out the inputs is intractable, even when the true input distribution is known. We present an approach for Gaussian Process models that simultaneously accounts for input uncertainty, thereby improving future predictions, and estimates the true values of the noisy inputs. Our algorithm is based on lower-bounding the true marginal likelihood and takes the form of an expectation-maximization procedure, alternately updating the model parameters and adjusting the estimates of the cleaned input points.
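
The alternating scheme described above can be illustrated with a minimal sketch. The code below is not the paper's algorithm: it stands in for the derived lower bound with a simple surrogate objective (the standard GP log marginal likelihood plus a Gaussian penalty tying the estimated inputs to the observed noisy ones), and all names (`alternate_em`, `input_noise_var`, etc.) are illustrative assumptions rather than quantities defined in the text.

```python
# Hedged sketch of an EM-style alternation for GP regression with noisy inputs.
# The true method lower-bounds the marginal likelihood; here a simple penalized
# likelihood is used purely for illustration.
import numpy as np
from scipy.optimize import minimize

def rbf_kernel(X, Z, log_lengthscale, log_variance):
    """Squared-exponential kernel with log-parameterized hyperparameters."""
    ell, var = np.exp(log_lengthscale), np.exp(log_variance)
    d2 = np.sum((X[:, None, :] - Z[None, :, :]) ** 2, axis=-1)
    return var * np.exp(-0.5 * d2 / ell ** 2)

def neg_log_marginal_likelihood(params, X, y):
    """Standard GP negative log marginal likelihood for fixed inputs X."""
    log_ell, log_var, log_noise = params
    n = X.shape[0]
    K = rbf_kernel(X, X, log_ell, log_var) + (np.exp(log_noise) + 1e-6) * np.eye(n)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    return 0.5 * y @ alpha + np.sum(np.log(np.diag(L))) + 0.5 * n * np.log(2 * np.pi)

def input_objective(X_flat, params, X_obs, y, input_noise_var):
    """Fit term plus a Gaussian penalty pulling the estimated (clean) inputs
    toward the observed noisy inputs -- an ad-hoc stand-in for the bound."""
    X = X_flat.reshape(X_obs.shape)
    penalty = 0.5 * np.sum((X - X_obs) ** 2) / input_noise_var
    return neg_log_marginal_likelihood(params, X, y) + penalty

def alternate_em(X_obs, y, input_noise_var=0.05, n_iters=10):
    """Alternately update GP hyperparameters (M-like step) and the
    estimates of the cleaned inputs (E-like step)."""
    X_est = X_obs.copy()
    params = np.zeros(3)  # log lengthscale, log signal variance, log noise variance
    for _ in range(n_iters):
        # M-like step: hyperparameters given the current input estimates.
        params = minimize(neg_log_marginal_likelihood, params, args=(X_est, y)).x
        # E-like step: refine the input estimates given the hyperparameters.
        res = minimize(input_objective, X_est.ravel(),
                       args=(params, X_obs, y, input_noise_var),
                       options={"maxiter": 50})
        X_est = res.x.reshape(X_obs.shape)
    return params, X_est

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X_true = np.linspace(0.0, 5.0, 30)[:, None]
    y = np.sin(X_true).ravel() + 0.1 * rng.standard_normal(30)
    X_noisy = X_true + 0.2 * rng.standard_normal(X_true.shape)  # corrupted inputs
    params, X_clean = alternate_em(X_noisy, y)
    print("learned log-hyperparameters:", params)
```

In the paper's procedure, both the parameter updates and the input estimates would be driven by the lower bound on the marginal likelihood rather than by the ad-hoc Gaussian penalty used in this sketch.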