GFlowNets and variational inference

This paper builds bridges between two families of probabilistic algorithms: (hierarchical) variational inference (VI), which is typically used to model distributions over continuous spaces, and generative flow networks (GFlowNets), which have been used for distributions over discrete structures such as graphs. We demonstrate that, in certain cases, VI algorithms are equivalent to special cases of GFlowNets, in the sense that the expected gradients of their learning objectives coincide. We then point out the differences between the two families and show how these differences emerge experimentally. Notably, GFlowNets, which borrow ideas from reinforcement learning, are more amenable than VI to off-policy training without incurring the high gradient variance induced by importance sampling. We argue that this property of GFlowNets gives them an advantage at capturing diversity in multimodal target distributions.
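
To make the gradient-equivalence claim concrete, the following is a schematic sketch in our own notation, not a formula quoted from the paper. Suppose a GFlowNet samples complete trajectories $\tau = (s_0 \to s_1 \to \cdots \to s_n = x)$ with a forward policy $P_F(\cdot\,;\theta)$, uses a fixed backward policy $P_B$, and is trained against a reward $R(x)$ with a trajectory-balance-style objective:

$$
\mathcal{L}_{\mathrm{TB}}(\tau;\theta) \;=\; \left( \log \frac{Z_\theta \prod_{t=0}^{n-1} P_F(s_{t+1} \mid s_t;\theta)}{R(x) \prod_{t=0}^{n-1} P_B(s_t \mid s_{t+1})} \right)^2 .
$$

When trajectories are sampled on-policy, $\tau \sim P_F(\cdot\,;\theta)$, the expected gradient of this loss with respect to the policy parameters coincides, up to a constant factor, with the REINFORCE gradient of a reverse KL divergence between the forward path distribution and the reward-reweighted backward path distribution:

$$
\nabla_\theta \, \mathbb{E}_{\tau \sim P_F}\big[\mathcal{L}_{\mathrm{TB}}(\tau;\theta)\big] \;\propto\; \nabla_\theta \, D_{\mathrm{KL}}\!\left( P_F(\tau;\theta) \,\Big\|\, \tfrac{1}{Z}\, R(x)\, P_B(\tau \mid x) \right).
$$

This is the sense in which on-policy GFlowNet training recovers a hierarchical VI objective (the precise statement and conditions are given in the paper). The asymmetry appears off-policy: the squared-error form can be minimized on trajectories from any full-support sampler without importance weights, whereas an unbiased KL gradient under a different sampler requires importance weighting, which is the source of the variance gap mentioned above.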
