Multi-Class Gaussian Process Classification Made Conjugate: Efficient Inference via Data Augmentation

We propose a new scalable multi-class Gaussian process classification approach that builds on a novel modified softmax likelihood function. The new likelihood has two benefits: it leads to well-calibrated uncertainty estimates, and it allows for an efficient latent variable augmentation. The augmented model is conditionally conjugate, which yields a fast variational inference method via block coordinate ascent updates. Previous approaches suffered from a trade-off between uncertainty calibration and speed. Our experiments show that our method leads to well-calibrated uncertainty estimates and competitive predictive performance while being up to two orders of magnitude faster than the state of the art.

Florian Wenzel | Manfred Opper | Christian Donner | Théo Galy-Fajou
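
The abstract does not spell out the modified softmax likelihood. As a rough illustration only, the sketch below assumes it takes the logistic-softmax form, in which the exponential of the standard softmax is replaced by the logistic sigmoid, so that p(y = k | f) = σ(f_k) / Σ_c σ(f_c). The function names and the example latent values are hypothetical and are not taken from this page.

```python
import numpy as np


def sigmoid(x):
    # Logistic function written via tanh, which avoids overflow for large |x|.
    return 0.5 * (1.0 + np.tanh(0.5 * x))


def logistic_softmax(f):
    # Assumed form of the modified likelihood: sigma(f_k) / sum_c sigma(f_c).
    s = sigmoid(f)
    return s / s.sum(axis=-1, keepdims=True)


def softmax(f):
    # Standard softmax, shown for comparison (max-shifted for stability).
    z = np.exp(f - f.max(axis=-1, keepdims=True))
    return z / z.sum(axis=-1, keepdims=True)


# Hypothetical latent function values for one input and three classes.
f = np.array([2.0, 0.0, -1.0])
p_ls = logistic_softmax(f)
p_sm = softmax(f)
print("logistic-softmax:", p_ls)
print("softmax:         ", p_sm)
```

Both functions return a valid probability vector and rank the classes identically. For these example values the logistic-softmax assigns less mass to the top class, because the sigmoid saturates where the exponential keeps growing.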
