Predictive Uncertainty Quantification with Compound Density Networks

Despite the huge success of deep neural networks (NNs), finding good mechanisms for quantifying their prediction uncertainty remains an open problem. Bayesian neural networks are one of the most popular approaches to uncertainty quantification. More recently, it was shown that ensembles of NNs, which belong to the class of mixture models, can also be used to quantify prediction uncertainty. In this paper, we build upon these two approaches. First, we increase the mixture model's flexibility by replacing the fixed mixing weights with an adaptive, input-dependent distribution (specifying the probability of each component) represented by NNs, and by considering uncountably many mixture components. The resulting class of models can be seen as the continuous counterpart to mixture density networks and is therefore referred to as compound density networks (CDNs). We employ both maximum likelihood and variational Bayesian inference to train CDNs, and empirically show that they yield better uncertainty estimates on out-of-distribution data and are more robust to adversarial examples than previous approaches.
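To make the construction concrete, here is a minimal sketch of the two predictive distributions being contrasted; the notation $p_k(y \mid x)$ for the component densities, $\pi_k(x)$ for the mixing weights, and $p(\theta \mid x; \phi)$ for the input-dependent mixing distribution is illustrative and not taken verbatim from the paper. A mixture density network combines finitely many components with input-dependent mixing weights,

$$ p(y \mid x) \;=\; \sum_{k=1}^{K} \pi_k(x)\, p_k(y \mid x), \qquad \sum_{k=1}^{K} \pi_k(x) = 1, $$

whereas a compound density network replaces the finite sum by an integral over an uncountable family of components, mixed by a distribution that itself depends on the input:

$$ p(y \mid x) \;=\; \int p(y \mid x, \theta)\, p(\theta \mid x;\, \phi)\, \mathrm{d}\theta. $$

In practice this integral is intractable and would presumably be approximated, for example by Monte Carlo samples drawn from $p(\theta \mid x; \phi)$.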
