Deep Ensemble Bayesian Active Learning: Addressing the Mode Collapse Issue in Monte Carlo Dropout via Ensembles

In image classification tasks, the ability of deep convolutional neural networks (CNNs) to deal with complex image data has proven unrivalled. However, they require large amounts of labelled training data to reach their full potential. In specialised domains such as healthcare, labelled data can be difficult and expensive to obtain. Active learning aims to alleviate this problem by reducing the amount of labelled data needed for a specific task while still delivering satisfactory performance. We propose DEBAL, a new active learning strategy designed for deep neural networks. DEBAL improves upon the current state-of-the-art deep Bayesian active learning method, whose Monte Carlo dropout approximation suffers from mode collapse, by exploiting the expressive power and statistical properties of model ensembles. The resulting uncertainty estimates better reflect the true uncertainty in the data, which translates into improved classification performance. We demonstrate empirically that our ensemble method yields faster convergence of CNNs trained on the MNIST and CIFAR-10 datasets.
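
To make the acquisition step concrete, below is a minimal NumPy sketch of the kind of information-based acquisition function such ensemble methods typically score pool points with (BALD, i.e. the mutual information between predictions and model parameters). The array shapes, the flattening of ensemble members and MC-dropout samples into a single axis, and the `bald_acquisition` name are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def bald_acquisition(probs, k=10, eps=1e-12):
    """Rank unlabelled pool points by mutual information (BALD).

    probs: array of shape (n_passes, n_pool, n_classes) holding the softmax
    output of each stochastic forward pass -- here one entry per
    (ensemble member, MC-dropout sample) pair, flattened together.
    Returns the indices of the k most informative pool points.
    """
    mean_p = probs.mean(axis=0)                        # marginal predictive distribution
    total = -(mean_p * np.log(mean_p + eps)).sum(-1)   # predictive entropy H[y | x]
    expected = -(probs * np.log(probs + eps)).sum(-1).mean(axis=0)  # E_w[H[y | x, w]]
    mutual_info = total - expected                     # epistemic (model) uncertainty
    return np.argsort(-mutual_info)[:k]

# Toy usage: 3 ensemble members x 20 MC-dropout samples, 1000 pool points, 10 classes.
rng = np.random.default_rng(0)
logits = rng.normal(size=(3 * 20, 1000, 10))
probs = np.exp(logits) / np.exp(logits).sum(-1, keepdims=True)
query_idx = bald_acquisition(probs, k=10)
print(query_idx)  # pool indices to send to the oracle for labelling
```

Averaging the predictive distributions over both ensemble members and dropout samples is what distinguishes this from plain MC dropout: if dropout alone collapses onto a single mode of the posterior, independently trained members restore the diversity that the mutual-information term needs to be informative.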
