David Duvenaud | Chris J. Maddison | Kevin Swersky | Milad Hashemi | Will Grathwohl
[1] Max Welling, et al. Auto-Encoding Variational Bayes, 2013, ICLR.
[2] Jun S. Liu. Peskun's theorem and a modified discrete-state Gibbs sampler, 1996.
[3] Yang Song, et al. Generative Modeling by Estimating Gradients of the Data Distribution, 2019, NeurIPS.
[4] S. Richardson, et al. Bayesian Models for Sparse Regression Analysis of High Dimensional Data, 2012.
[5] Mohammad Norouzi, et al. Your Classifier is Secretly an Energy Based Model and You Should Treat it Like One, 2019, ICLR.
[6] Zhijian Ou, et al. Learning Neural Random Fields with Inclusive Auxiliary Generators, 2018, arXiv.
[7] Quoc V. Le, et al. Searching for Activation Functions, 2018, arXiv.
[8] G. Stormo, et al. Correlated mutations in models of protein sequences: phylogenetic and structural effects, 1999.
[9] Yee Whye Teh, et al. The Concrete Distribution: A Continuous Relaxation of Discrete Random Variables, 2016, ICLR.
[10] 이상헌, et al. Deep Belief Networks, 2010, Encyclopedia of Machine Learning.
[11] Aapo Hyvärinen, et al. Noise-contrastive estimation: A new estimation principle for unnormalized statistical models, 2010, AISTATS.
[12] Radford M. Neal. MCMC Using Hamiltonian Dynamics, 2011, arXiv:1206.1901.
[13] Caiming Xiong, et al. Joint Energy-based Model Training for Better Calibrated Natural Language Understanding Models, 2021, arXiv.
[14] Umrigar, et al. Accelerated Metropolis method, 1993, Physical Review Letters.
[15] J. Besag. Statistical Analysis of Non-Lattice Data, 1975.
[16] Geoffrey E. Hinton. Training Products of Experts by Minimizing Contrastive Divergence, 2002, Neural Computation.
[17] Qiang Liu, et al. Stein Variational Inference for Discrete Distributions, 2020, AISTATS.
[18] Ruslan Salakhutdinov, et al. Accurate and conservative estimates of MRF log-likelihood using reverse annealing, 2014, AISTATS.
[19] J. Rosenthal, et al. Adaptive Gibbs samplers and related MCMC methods, 2011, arXiv:1101.5838.
[20] Tijmen Tieleman, et al. Training restricted Boltzmann machines using approximations to the likelihood gradient, 2008, ICML '08.
[21] Debora S. Marks, et al. Variational Inference for Sparse and Undirected Models, 2016, ICML.
[22] Michael I. Miller, et al. Representations of Knowledge in Complex Systems, 1994.
[23] Myle Ott, et al. Residual Energy-Based Models for Text Generation, 2020, ICLR.
[24] Geoffrey E. Hinton, et al. Using fast weights to improve persistent contrastive divergence, 2009, ICML '09.
[25] Jürgen Schmidhuber, et al. Learning to Forget: Continual Prediction with LSTM, 2000, Neural Computation.
[26] Max Welling, et al. VAE with a VampPrior, 2017, AISTATS.
[27] Radford M. Neal. Annealed importance sampling, 1998, Stat. Comput.
[28] Jian Sun, et al. Deep Residual Learning for Image Recognition, 2016, IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[29] Igor Mordatch, et al. Implicit Generation and Generalization with Energy Based Models, 2018.
[30] V. Climenhaga. Markov chains and mixing times, 2013.
[31] Thomas A. Hopf, et al. Protein 3D Structure Computed from Evolutionary Sequence Variation, 2011, PLoS ONE.
[32] Song-Chun Zhu, et al. Stochastic Security: Adversarial Defense Using Long-Run Dynamics of Energy-Based Models, 2020, ICLR.
[33] Qiang Liu, et al. Stein Variational Gradient Descent Without Gradient, 2018, ICML.
[34] Bernhard Schölkopf, et al. A Kernel Two-Sample Test, 2012, J. Mach. Learn. Res.
[35] D. Dunson, et al. Discontinuous Hamiltonian Monte Carlo for sampling discrete parameters, 2017.
[36] Giacomo Zanella, et al. Informed Proposals for Local MCMC in Discrete Spaces, 2017, Journal of the American Statistical Association.
[37] Siwei Lyu, et al. Interpretation and Generalization of Score Matching, 2009, UAI.
[38] Dilin Wang, et al. Stein Variational Gradient Descent: A General Purpose Bayesian Inference Algorithm, 2016, NIPS.
[39] Richard Zemel, et al. Learning the Stein Discrepancy for Training and Evaluating Energy-Based Models without Sampling, 2020, ICML.
[40] Tian Han, et al. On the Anatomy of MCMC-based Maximum Likelihood Learning of Energy-Based Models, 2019, AAAI.
[41] Beatrice Santorini, et al. The Penn Treebank: An Overview, 2003.
[42] Christopher Yau, et al. The Hamming Ball Sampler, 2015, Journal of the American Statistical Association.
[43] Mohammad Norouzi, et al. No MCMC for me: Amortized sampling for fast and stable training of energy-based models, 2021, ICLR.
[44] Jimmy Ba, et al. Adam: A Method for Stochastic Optimization, 2014, ICLR.