Assessing Deep Neural Networks as Probability Estimators

Deep Neural Networks (DNNs) have performed admirably in classification tasks. However, the characterization of their classification uncertainties, required for certain applications, has been lacking. In this work, we investigate this issue by assessing DNNs' ability to estimate conditional probabilities, and we propose a framework for systematic uncertainty characterization. Denoting the input sample as x and the category as y, the classification task of assigning a category y to a given input x can be reduced to estimating the conditional probabilities p(y|x), which the DNN approximates at its last layer using the softmax function. Since softmax yields a vector whose elements all lie in the interval (0, 1) and sum to 1, it suggests a probabilistic interpretation of the DNN's output. Using synthetic and real-world datasets, we examine the impact of various factors, e.g., the probability density f(x) and inter-categorical sparsity, on the precision of DNNs' estimates of p(y|x), and find that the probability density of the input and the inter-categorical sparsity have greater impacts than the prior probability on DNNs' classification uncertainty.
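As a minimal illustration of the probabilistic reading described above (a sketch, not the paper's experimental setup), the snippet below applies a numerically stable softmax to hypothetical last-layer logits and checks the two properties that license the interpretation of the output as estimates of p(y|x): every element lies in (0, 1) and the vector sums to 1.

```python
import numpy as np

def softmax(logits):
    """Numerically stable softmax: subtract the max logit before exponentiating."""
    z = logits - np.max(logits)
    e = np.exp(z)
    return e / e.sum()

# Hypothetical last-layer logits for a 3-class problem (illustrative values).
logits = np.array([2.0, 1.0, -1.0])
p = softmax(logits)
# Each p[k] lies in (0, 1) and the vector sums to 1, so p[k] can be read
# as the network's estimate of the conditional probability p(y = k | x).
```

Note that these outputs are only *estimates* of p(y|x); how closely they match the true conditional probabilities is exactly the question the work investigates.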
