Bayesian deep learning: A model-based interpretable approach

Deep learning is often regarded as a model-free, end-to-end, black-box approach: it demands large numbers of data samples rather than expert knowledge of the target domain, and it does not explain the mechanism or the reasons behind its decisions. This opacity is considered a critical limitation of deep learning. This paper introduces an alternative viewpoint, namely Bayesian deep learning. Deep learning components can be embedded in an existing framework such as a Bayesian network or reinforcement learning; an expert can then encode domain knowledge as the graph structure, accelerate learning, and extract new knowledge about the target domain. This framework is termed the deep generative model. Conversely, the Bayesian modeling approach can be introduced directly into deep learning. It then becomes possible to examine how confident deep learning is in its decisions by quantifying the uncertainty of its outputs, and to detect wrong decisions or anomalous inputs. With these approaches, one can adjust the "brightness" of deep learning, that is, how far its black box is opened.
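The following is a minimal, self-contained sketch (in PyTorch, which the paper does not prescribe) of the second approach: keeping dropout active at test time and averaging several stochastic forward passes, so that the spread of the sampled outputs quantifies the uncertainty of a decision and high-entropy predictions can be flagged as unreliable or anomalous. The network architecture, sample count, and threshold are illustrative assumptions, not the authors' implementation.

# Illustrative sketch: predictive uncertainty via Monte Carlo dropout.
# The architecture and threshold below are arbitrary examples.
import torch
import torch.nn as nn
import torch.nn.functional as F

class Classifier(nn.Module):
    def __init__(self, in_dim=784, hidden=256, n_classes=10, p_drop=0.5):
        super().__init__()
        self.fc1 = nn.Linear(in_dim, hidden)
        self.drop = nn.Dropout(p_drop)
        self.fc2 = nn.Linear(hidden, n_classes)

    def forward(self, x):
        return self.fc2(self.drop(F.relu(self.fc1(x))))

@torch.no_grad()
def mc_dropout_predict(model, x, n_samples=30):
    """Average several stochastic forward passes; the entropy of the
    averaged softmax output serves as an uncertainty estimate."""
    model.train()  # keep dropout stochastic at test time (a crude posterior approximation)
    probs = torch.stack([F.softmax(model(x), dim=-1) for _ in range(n_samples)])
    mean = probs.mean(dim=0)                                      # predictive distribution
    entropy = -(mean * mean.clamp_min(1e-12).log()).sum(dim=-1)   # predictive entropy
    return mean, entropy

# Usage: flag inputs whose predictive entropy exceeds a chosen threshold
# as potentially wrong decisions or anomalous inputs.
model = Classifier()
x = torch.randn(8, 784)          # dummy batch of inputs
mean, entropy = mc_dropout_predict(model, x)
suspicious = entropy > 1.5       # threshold is an arbitrary example value

Monte Carlo dropout is only one convenient approximation; the same flagging logic applies to any Bayesian treatment of the network that yields samples from a predictive distribution.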
