Trust-Region Variational Inference with Gaussian Mixture Models

Many machine learning methods rely on approximate inference from intractable probability distributions. Variational inference approximates such distributions with tractable models that can subsequently be used for approximate inference. Learning sufficiently accurate approximations requires a rich model family and careful exploration of the relevant modes of the target distribution. We propose a method for learning accurate Gaussian mixture model (GMM) approximations of intractable probability distributions. The method builds on insights from policy search and establishes information-geometric trust regions for principled exploration. To improve the GMM approximation efficiently, we derive a lower bound on the corresponding optimization objective that allows the components to be updated independently. Using this lower bound ensures convergence to a local optimum of the original objective. The number of components is adapted online by adding new components in promising regions and deleting components with negligible weight. We demonstrate on several domains that the method learns approximations of complex, multi-modal distributions with a quality unmatched by previous variational inference methods, and that samples drawn from the GMM approximation are on par with samples produced by state-of-the-art MCMC samplers while requiring up to three orders of magnitude fewer computational resources.
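The abstract does not spell out the lower bound, so the following is only a minimal numerical sketch of one way such a component-wise bound can be formed. It assumes the objective J(q) = ∫ q(x) (log p̃(x) − log q(x)) dx for an unnormalized target p̃, and replaces the true responsibilities q(o|x) with fixed auxiliary responsibilities q̃(o|x). The one-dimensional target, the grid integration, and all function names are hypothetical choices made for illustration, not the authors' implementation.

```python
import numpy as np


def normal_pdf(x, mean, std):
    """Density of a univariate Gaussian evaluated at x."""
    return np.exp(-0.5 * ((x - mean) / std) ** 2) / (std * np.sqrt(2.0 * np.pi))


def responsibilities(weights, means, stds, grid):
    """Posterior q(o|x) of each mixture component on the grid, shape (K, N)."""
    joint = weights[:, None] * normal_pdf(grid[None, :], means[:, None], stds[:, None])
    return joint / joint.sum(axis=0, keepdims=True)


def objective(weights, means, stds, log_target, grid):
    """J(q) = integral of q(x) * (log p~(x) - log q(x)), via a Riemann sum."""
    dx = grid[1] - grid[0]
    comps = normal_pdf(grid[None, :], means[:, None], stds[:, None])
    q = (weights[:, None] * comps).sum(axis=0)
    return np.sum(q * (log_target(grid) - np.log(q))) * dx


def lower_bound(weights, means, stds, log_target, aux_resp, grid):
    """Decomposed bound using fixed auxiliary responsibilities aux_resp (K, N)."""
    dx = grid[1] - grid[0]
    comps = normal_pdf(grid[None, :], means[:, None], stds[:, None])
    # One term per component: E_{q(x|o)}[log p~(x) + log q~(o|x)] + H(q(x|o)).
    # Each term depends only on that component's own parameters.
    integrand = comps * (log_target(grid) + np.log(aux_resp) - np.log(comps))
    per_component = np.sum(integrand, axis=1) * dx
    # Weighted sum of the component terms plus the entropy of the weights.
    return np.sum(weights * per_component) - np.sum(weights * np.log(weights))


def log_target(x):
    """Hypothetical unnormalized bimodal target (illustration only)."""
    return np.log(normal_pdf(x, -2.0, 0.7) + normal_pdf(x, 2.0, 0.7))


grid = np.linspace(-10.0, 10.0, 20001)
weights = np.array([0.3, 0.7])
means = np.array([-1.5, 1.5])
stds = np.array([1.0, 1.2])

J = objective(weights, means, stds, log_target, grid)

# Auxiliary responsibilities taken from the current mixture: the bound is tight.
tight = lower_bound(weights, means, stds, log_target,
                    responsibilities(weights, means, stds, grid), grid)

# Auxiliary responsibilities taken from a different mixture: strictly lower.
other = responsibilities(np.array([0.5, 0.5]), np.array([-0.5, 0.5]),
                         np.array([1.0, 1.0]), grid)
loose = lower_bound(weights, means, stds, log_target, other, grid)

print(f"J = {J:.6f}, tight bound = {tight:.6f}, loose bound = {loose:.6f}")
```

In this sketch the gap between J and the bound equals the expected KL divergence between the true and auxiliary responsibilities, which is why the bound touches the objective when the auxiliary responsibilities are computed from the current mixture. Because each per-component term involves only that component's parameters, the components can in principle be improved one at a time.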
