Infinitely Deep Bayesian Neural Networks with Stochastic Differential Equations

We perform scalable approximate inference in a recently proposed family of continuous-depth Bayesian neural networks. In this model class, uncertainty about the separate weights in each layer produces dynamics that follow a stochastic differential equation (SDE). We demonstrate gradient-based stochastic variational inference in this infinite-parameter setting, producing arbitrarily flexible approximate posteriors. We also derive a novel gradient estimator that approaches zero variance as the approximate posterior approaches the true posterior. This approach further inherits the memory-efficient training and tunable precision of neural ODEs.
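
To make the setup concrete, here is a minimal Python/PyTorch sketch of the kind of model and inference scheme the abstract describes; it is an illustration, not the authors' implementation. It assumes a Brownian-motion prior on the weight process (zero drift), a learned posterior drift network, a fixed-step Euler-Maruyama solver, and a shared diffusion scale. Under these assumptions, Girsanov's theorem gives the KL divergence between the posterior and prior weight-path measures as half the time integral of the squared drift difference over the diffusion, yielding a one-sample stochastic ELBO. All names here (f, posterior_drift, SIGMA) are hypothetical.

import torch
import torch.nn as nn

SIGMA = 0.1              # diffusion scale, shared by prior and posterior (assumption)
STEPS, T = 20, 1.0       # Euler-Maruyama steps and terminal time
DT = T / STEPS
HIDDEN_DIM, WEIGHT_DIM = 8, 16

# Hypothetical networks: f maps (hidden state, weights) to dh/dt; the posterior
# drift replaces the zero drift of the Brownian-motion prior on the weights.
f = nn.Sequential(nn.Linear(HIDDEN_DIM + WEIGHT_DIM, HIDDEN_DIM), nn.Tanh())
posterior_drift = nn.Sequential(nn.Linear(WEIGHT_DIM + 1, WEIGHT_DIM), nn.Tanh())

def elbo(x, y, log_likelihood):
    """One-sample stochastic ELBO: E_q[log p(y | h_T)] - KL(q || p) over weight paths."""
    h = x
    w = torch.zeros(x.shape[0], WEIGHT_DIM)
    kl = torch.zeros(x.shape[0])
    for i in range(STEPS):
        t = torch.full((x.shape[0], 1), i * DT)
        u = posterior_drift(torch.cat([w, t], dim=-1))       # posterior drift at (w, t)
        # Girsanov: with a zero prior drift, the path-space KL accumulates
        # 0.5 * ||u / sigma||^2 dt along the sampled weight trajectory.
        kl = kl + 0.5 * (u / SIGMA).pow(2).sum(dim=-1) * DT
        w = w + u * DT + SIGMA * DT ** 0.5 * torch.randn_like(w)  # weight SDE step
        h = h + f(torch.cat([h, w], dim=-1)) * DT                 # hidden-state step
    return log_likelihood(h, y) - kl

# Usage sketch with a Gaussian-style likelihood (illustrative only):
x = torch.randn(32, HIDDEN_DIM)
y = torch.randn(32, HIDDEN_DIM)
loss = -elbo(x, y, lambda h, t: -((h - t) ** 2).sum(dim=-1)).mean()
loss.backward()

In practice, the fixed-step loop above would presumably be replaced by an adaptive SDE solver with adjoint-based gradients (e.g., a library such as torchsde) to obtain the memory-efficient training and tunable precision mentioned in the abstract, and the paper's low-variance gradient estimator would replace this naive one-sample ELBO estimate.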
