How Good is the Bayes Posterior in Deep Neural Networks Really?
Florian Wenzel | Kevin Roth | Bastiaan S. Veeling | Jakub Swiatkowski | Linh Tran | Stephan Mandt | Jasper Snoek | Tim Salimans | Rodolphe Jenatton | Sebastian Nowozin
[1] G. Brier. Verification of Forecasts Expressed in Terms of Probability, 1950.
[2] H. L. Gray, et al. On Bias Reduction in Estimation, 1971.
[3] Wang, et al. Replica Monte Carlo simulation of spin glasses, 1986, Physical Review Letters.
[4] S. Duane, et al. Hybrid Monte Carlo, 1987.
[5] J. Berger. Statistical Decision Theory and Bayesian Analysis, 1988.
[6] D. Rubin, et al. Inference from Iterative Simulation Using Multiple Sequences, 1992.
[7] Geoffrey E. Hinton, et al. Keeping the neural networks simple by minimizing the description length of the weights, 1993, COLT '93.
[8] David R. Wolf, et al. Estimating functions of probability distributions from a finite set of samples, 1994, Physical Review E.
[9] Radford M. Neal. Bayesian Learning for Neural Networks, 1995.
[10] F. Komaki. On asymptotic properties of predictive distributions, 1996.
[11] Y. Sugita, et al. Replica-exchange molecular dynamics method for protein folding, 1999.
[12] William Bialek, et al. Entropy and Inference, Revisited, 2001, NIPS.
[13] Galin L. Jones. On the Markov chain central limit theorem, 2004, arXiv:math/0409112.
[14] Michael W. Deem, et al. Parallel tempering: theory, applications, and new perspectives, 2005, Physical Chemistry Chemical Physics (PCCP).
[15] Tadayoshi Fushiki. Bootstrap prediction and Bayesian prediction under misspecified models, 2005.
[16] Olle Häggström, et al. On Variance Conditions for Markov Chain CLTs, 2007.
[17] Alex Krizhevsky, et al. Learning Multiple Layers of Features from Tiny Images, 2009.
[18] John K. Kruschke, et al. Bayesian data analysis, 2010, Wiley Interdisciplinary Reviews: Cognitive Science.
[19] Yoshua Bengio, et al. Understanding the difficulty of training deep feedforward neural networks, 2010, AISTATS.
[20] Andrew Gelman, et al. Handbook of Markov Chain Monte Carlo, 2011.
[21] Radford M. Neal. MCMC Using Hamiltonian Dynamics, 2011, arXiv:1206.1901.
[22] Yee Whye Teh, et al. Bayesian Learning via Stochastic Gradient Langevin Dynamics, 2011, ICML.
[23] Kevin P. Murphy. Machine Learning: A Probabilistic Perspective, 2012, Adaptive Computation and Machine Learning series.
[24] Lennard Jansen, et al. Robust Bayesian inference under model misspecification, 2013.
[25] Geoffrey E. Hinton, et al. On the importance of initialization and momentum in deep learning, 2013, ICML.
[26] R. Ramamoorthi, et al. On Posterior Concentration in Misspecified Models, 2013, arXiv:1312.4620.
[27] M. Betancourt, et al. Hamiltonian Monte Carlo for Hierarchical Models, 2013, arXiv:1312.0906.
[28] Nitish Srivastava, et al. Dropout: a simple way to prevent neural networks from overfitting, 2014, J. Mach. Learn. Res.
[29] Ryan Babbush, et al. Bayesian Sampling Using Stochastic Gradient Thermostats, 2014, NIPS.
[30] Tianqi Chen, et al. Stochastic Gradient Hamiltonian Monte Carlo, 2014, ICML.
[31] Andrew Gelman, et al. The No-U-turn sampler: adaptively setting path lengths in Hamiltonian Monte Carlo, 2011, J. Mach. Learn. Res.
[32] Thijs van Ommen, et al. Inconsistency of Bayesian Inference for Misspecified Linear Models, and a Proposal for Repairing It, 2014, arXiv:1412.3730.
[33] Max Welling, et al. Variational Dropout and the Local Reparameterization Trick, 2015, NIPS.
[34] Sergey Ioffe, et al. Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift, 2015, ICML.
[35] Tianqi Chen, et al. A Complete Recipe for Stochastic Gradient MCMC, 2015, NIPS.
[36] Julien Cornebise, et al. Weight Uncertainty in Neural Networks, 2015, ICML.
[37] Milos Hauskrecht, et al. Obtaining Well Calibrated Probabilities Using Bayesian Binning, 2015, AAAI.
[38] Jian Sun, et al. Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification, 2015, IEEE International Conference on Computer Vision (ICCV).
[39] Julien Cornebise, et al. Weight Uncertainty in Neural Networks, 2015, ArXiv.
[40] Zhanxing Zhu, et al. Covariance-Controlled Adaptive Langevin Thermostat for Large-Scale Bayesian Sampling, 2015, NIPS.
[41] Lawrence Carin, et al. On the Convergence of Stochastic Gradient MCMC Algorithms with High-Order Integrators, 2015, NIPS.
[42] Lawrence Carin, et al. Preconditioned Stochastic Gradient Langevin Dynamics for Deep Neural Networks, 2015, AAAI.
[43] Lili Zhao, et al. Current Trends in Bayesian Methodology with Applications, 2016.
[44] Farhan Abrol, et al. Variational Tempering, 2016, AISTATS.
[45] Jian Sun, et al. Deep Residual Learning for Image Recognition, 2015, IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[46] Yuan Yu, et al. TensorFlow: A system for large-scale machine learning, 2016, OSDI.
[47] Alexandre Lacoste, et al. PAC-Bayesian Theory Meets Bayesian Inference, 2016, NIPS.
[48] Zoubin Ghahramani, et al. Dropout as a Bayesian Approximation: Representing Model Uncertainty in Deep Learning, 2015, ICML.
[49] Luis Perez, et al. The Effectiveness of Data Augmentation in Image Classification using Deep Learning, 2017, ArXiv.
[50] Bohyung Han, et al. Regularizing Deep Neural Networks by Noise: Its Interpretation and Optimization, 2017, NIPS.
[51] Surya Ganguli, et al. Deep Information Propagation, 2016, ICLR.
[52] Daniel Flam-Shepherd. Mapping Gaussian Process Priors to Bayesian Neural Networks, 2017.
[53] Charles Blundell, et al. Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles, 2016, NIPS.
[54] Dustin Tran, et al. TensorFlow Distributions, 2017, ArXiv.
[55] David M. Blei, et al. Stochastic Gradient Descent as Approximate Bayesian Inference, 2017, J. Mach. Learn. Res.
[56] Carlo Luschi, et al. Revisiting Small Batch Training for Deep Neural Networks, 2018, ArXiv.
[57] J. Liao, et al. Sharpening Jensen's Inequality, 2017, The American Statistician.
[58] Dmitry P. Vetrov, et al. Uncertainty Estimation via Stochastic Batch Normalization, 2018, ICLR.
[59] Jaehoon Lee, et al. Deep Neural Networks as Gaussian Processes, 2017, ICLR.
[60] Guodong Zhang, et al. Noisy Natural Gradient as Variational Inference, 2017, ICML.
[61] Sebastian Nowozin, et al. Debiasing Evidence Approximations: On Importance-weighted Autoencoders and Jackknife Variational Inference, 2018, ICLR.
[62] David M. Blei, et al. Noisin: Unbiased Regularization for Recurrent Neural Networks, 2018, ICML.
[63] Arnaud Doucet, et al. On the Selection of Initialization and Activation Function for Deep Neural Networks, 2018, ArXiv.
[64] Boris Flach, et al. Stochastic Normalizations as Bayesian Learning, 2018, ACCV.
[65] Guodong Zhang, et al. Eigenvalue Corrected Noisy Natural Gradient, 2018, ArXiv.
[66] Richard E. Turner, et al. Gaussian Process Behaviour in Wide Deep Neural Networks, 2018, ICLR.
[67] B. Leimkuhler, et al. Partitioned integrators for thermodynamic parameterization of neural networks, 2019, Foundations of Data Science.
[68] Benedict J. Leimkuhler, et al. TATi-Thermodynamic Analytics ToolkIt: TensorFlow-based software for posterior sampling in machine learning applications, 2019, ArXiv.
[69] Zhanxing Zhu, et al. The Anisotropic Noise in Stochastic Gradient Descent: Its Behavior of Escaping from Sharp Minima and Regularization Effects, 2018, ICML.
[70] Sebastian Nowozin, et al. Can You Trust Your Model's Uncertainty? Evaluating Predictive Uncertainty Under Dataset Shift, 2019, NeurIPS.
[71] T. Lillicrap, et al. Noise Contrastive Priors for Functional Uncertainty, 2018, UAI.
[72] Sho Yaida, et al. Fluctuation-dissipation relations for stochastic gradient descent, 2018, ICLR.
[73] Laurence Aitchison, et al. Deep Convolutional Networks as shallow Gaussian Processes, 2018, ICLR.
[74] Hiroshi Inoue, et al. Multi-Sample Dropout for Accelerated Training and Better Generalization, 2019, ArXiv.
[75] A. Bhattacharya, et al. Bayesian fractional posteriors, 2016, The Annals of Statistics.
[76] Levent Sagun, et al. A Tail-Index Analysis of Stochastic Gradient Noise in Deep Neural Networks, 2019, ICML.
[77] Nal Kalchbrenner, et al. Bayesian Inference for Large Scale Image Classification, 2019, ArXiv.
[78] Guodong Zhang, et al. Functional Variational Bayesian Neural Networks, 2019, ICLR.
[79] Richard E. Turner, et al. Practical Deep Learning with Bayesian Principles, 2019, NeurIPS.
[80] Padhraic Smyth, et al. Dropout as a Structured Shrinkage Prior, 2018, ICML.
[81] Greg Yang, et al. Scaling Limits of Wide Neural Networks with Weight Sharing: Gaussian Process Behavior, Gradient Independence, and Neural Tangent Kernel Derivation, 2019, ArXiv.
[82] Jaehoon Lee, et al. Bayesian Deep Convolutional Networks with Many Channels are Gaussian Processes, 2018, ICLR.
[83] Arno Solin, et al. Applied Stochastic Differential Equations, 2019.
[84] Nicola Marzari, et al. Bayesian Neural Networks at Finite Temperature, 2019, ArXiv.
[85] Tim Pearce, et al. Expressive Priors in Bayesian Neural Networks: Kernel Combinations and Periodic Functions, 2019, UAI.
[86] Jimmy Ba, et al. BatchEnsemble: Efficient Ensemble of Deep Neural Networks via Rank-1 Perturbation, 2019.
[87] Andrew Gordon Wilson, et al. Cyclical Stochastic Gradient MCMC for Bayesian Deep Learning, 2019, ICLR.
[88] Saurabh Singh, et al. Filter Response Normalization Layer: Eliminating Batch Dependence in the Training of Deep Neural Networks, 2019, IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[89] Andrew Gordon Wilson, et al. The Case for Bayesian Deep Learning, 2020, ArXiv.
[90] Andres R. Masegosa, et al. Learning under Model Misspecification: Applications to Variational and Ensemble methods, 2019, NeurIPS.
[91] Junpeng Lao, et al. tfp.mcmc: Modern Markov Chain Monte Carlo Tools Built for Modern Hardware, 2020, ArXiv.
[92] Pavel Izmailov, et al. Bayesian Deep Learning and a Probabilistic Perspective of Generalization, 2020, NeurIPS.
[93] Dmitry Vetrov, et al. Pitfalls of In-Domain Uncertainty Estimation and Ensembling in Deep Learning, 2020, ICLR.