Confident Classification Using a Hybrid Between Deterministic and Probabilistic Convolutional Neural Networks

Traditional neural networks trained using point-based maximum likelihood estimation are deterministic models and have exhibited near-human performance in many image classification tasks. However, because they represent network parameters with point estimates, they cannot capture all possible combinations of the weights; consequently, the resulting predictor is biased towards its initialisation. Most importantly, these deterministic networks are inherently unable to provide any uncertainty estimate for their predictions, which is highly sought after in many critical application areas. Bayesian neural networks, on the other hand, place a probability distribution on the network weights. This gives a built-in regularisation effect, which allows these models to learn well from small datasets without overfitting. These networks also provide a way of generating a posterior distribution that can be used to estimate the model's uncertainty. However, Bayesian estimation is computationally very expensive since it greatly widens the parameter space. This paper proposes a hybrid convolutional neural network that combines the high accuracy of deterministic models with the posterior distribution approximation of Bayesian neural networks. The hybrid architecture is validated on 13 publicly available benchmark classification datasets from a wide range of domains and modalities, such as natural scene images, medical images, and time series. Our results show that the proposed hybrid approach performs better than both deterministic and Bayesian methods in terms of classification accuracy and also provides an estimate of uncertainty for every prediction. We further employ this uncertainty to filter out unconfident predictions and achieve a significant additional gain in accuracy for the remaining predictions.
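The filtering step described above can be illustrated with a minimal sketch. It assumes that uncertainty is measured as the entropy of class probabilities averaged over several stochastic forward passes; the abstract does not state which uncertainty measure or threshold is used, so the function names, the `max_entropy` threshold, and the sample data below are all hypothetical.

```python
import math


def predictive_mean(samples):
    """Average class probabilities over several stochastic forward passes."""
    n = len(samples)
    return [sum(s[c] for s in samples) / n for c in range(len(samples[0]))]


def predictive_entropy(probs):
    """Entropy (in nats) of a class probability vector; higher means less confident."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def confident_predictions(batch_samples, max_entropy):
    """Keep (input index, predicted class, entropy) for inputs at or below the threshold.

    The entropy threshold is an assumed filtering rule, not the paper's stated one.
    """
    kept = []
    for i, samples in enumerate(batch_samples):
        mean = predictive_mean(samples)
        h = predictive_entropy(mean)
        if h <= max_entropy:
            predicted = max(range(len(mean)), key=mean.__getitem__)
            kept.append((i, predicted, h))
    return kept


# Hypothetical softmax outputs: 2 inputs, 3 stochastic passes each, 2 classes.
batch = [
    [[0.90, 0.10], [0.95, 0.05], [0.85, 0.15]],  # passes agree: low entropy
    [[0.90, 0.10], [0.20, 0.80], [0.40, 0.60]],  # passes disagree: high entropy
]
kept = confident_predictions(batch, max_entropy=0.5)
for index, predicted, h in kept:
    print(f"input {index}: class {predicted}, entropy {h:.3f}")
```

In this sketch only the first input survives the filter; accuracy would then be reported on the retained predictions, at the cost of leaving the rejected inputs unclassified.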
