Advances in Variational Inference

Many modern unsupervised or semi-supervised machine learning algorithms rely on Bayesian probabilistic models. Exact posterior inference in these models is usually intractable, so approximate inference is required. Variational inference (VI) lets us approximate a high-dimensional Bayesian posterior with a simpler variational distribution by solving an optimization problem. This approach has been successfully applied to various models and large-scale applications. In this review, we give an overview of recent trends in variational inference. We first introduce standard mean field variational inference, then review recent advances focusing on the following aspects: (a) scalable VI, which includes stochastic approximations, (b) generic VI, which extends the applicability of VI to a large class of otherwise intractable models, such as non-conjugate models, (c) accurate VI, which includes variational models beyond the mean field approximation or with atypical divergences, and (d) amortized VI, which implements the inference over local latent variables with inference networks. Finally, we provide a summary of promising future research directions.
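
As a minimal sketch of what "approximating a posterior by solving an optimization problem" can look like, the following fits a Gaussian variational distribution by gradient ascent on the evidence lower bound (ELBO). The toy model (z ~ N(0, 1), x_i | z ~ N(z, 1)), the function names, the step size, and the data are all assumptions chosen for illustration and are not taken from the review. The model is conjugate, so the exact posterior is known and lies inside the variational family; in the intractable models the review targets, neither is generally true.

```python
import math


def fit_gaussian_vi(xs, steps=2000, lr=0.01):
    """Fit q(z) = N(m, s^2) by gradient ascent on a closed-form ELBO.

    Assumed toy model: z ~ N(0, 1), x_i | z ~ N(z, 1).
    Up to an additive constant, the ELBO is
        -0.5 * (m^2 + s^2) - 0.5 * sum((x_i - m)^2 + s^2) + log(s).
    """
    n = len(xs)
    sum_x = sum(xs)
    m, log_s = 0.0, 0.0  # optimize log(s) so that s stays positive
    for _ in range(steps):
        s = math.exp(log_s)
        grad_m = sum_x - (n + 1) * m          # dELBO/dm
        grad_log_s = 1.0 - (n + 1) * s * s    # dELBO/dlog(s) = s * dELBO/ds
        m += lr * grad_m
        log_s += lr * grad_log_s
    return m, math.exp(log_s)


def exact_posterior(xs):
    """Mean and standard deviation of the exact posterior p(z | x) for the toy model."""
    n = len(xs)
    return sum(xs) / (n + 1), math.sqrt(1.0 / (n + 1))


if __name__ == "__main__":
    data = [1.2, 0.7, 1.9, 1.1, 0.4]  # made-up observations
    print("variational:", fit_gaussian_vi(data))
    print("exact:      ", exact_posterior(data))
```

Because the variational family here contains the true posterior, the optimizer recovers it and the gap between the ELBO and the log evidence closes. The fixed step size is tuned to this small data set; a larger data set would need a smaller step or an adaptive optimizer.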
