Differentiable Antithetic Sampling for Variance Reduction in Stochastic Variational Inference

Stochastic optimization techniques are standard in variational inference algorithms. These methods estimate gradients by approximating expectations with independent Monte Carlo samples. In this paper, we explore a technique that uses correlated, but more representative, samples to reduce estimator variance. Specifically, we show how to generate antithetic samples whose sample moments match the true moments of an underlying importance distribution. Combining a differentiable antithetic sampler with modern stochastic variational inference, we showcase the effectiveness of this approach for learning a deep generative model.
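To make the idea of antithetic sampling concrete, the following is a minimal sketch of the classic mirrored-pair construction for a Gaussian, not the sampler proposed in this paper. It matches only the first moment (the sample mean equals the true mean), and it does not show the differentiable, higher-moment-matching sampler the abstract describes. The function names, the test function `np.tanh`, and the parameter values are all assumptions chosen for illustration.

```python
import numpy as np


def iid_samples(mu, sigma, n, rng):
    """Draw n independent samples from N(mu, sigma^2)."""
    return mu + sigma * rng.standard_normal(n)


def antithetic_samples(mu, sigma, n, rng):
    """Draw n/2 noise values and mirror them around zero (n must be even).

    Each sample x is paired with 2*mu - x, so the sample mean equals mu
    up to floating-point error. This is the textbook construction only.
    """
    if n % 2 != 0:
        raise ValueError("n must be even")
    eps = rng.standard_normal(n // 2)
    return mu + sigma * np.concatenate([eps, -eps])


def estimator_variance(sampler, f, mu, sigma, n, trials, seed=0):
    """Empirical variance of the Monte Carlo estimate of E[f(x)]."""
    rng = np.random.default_rng(seed)
    estimates = [f(sampler(mu, sigma, n, rng)).mean() for _ in range(trials)]
    return float(np.var(estimates))


# Hypothetical settings: a monotone test function and a small sample budget.
mu, sigma, n, trials = 0.5, 1.0, 8, 2000
var_iid = estimator_variance(iid_samples, np.tanh, mu, sigma, n, trials)
var_anti = estimator_variance(antithetic_samples, np.tanh, mu, sigma, n, trials)
print(f"i.i.d. variance:     {var_iid:.5f}")
print(f"antithetic variance: {var_anti:.5f}")
```

For a monotone integrand, the two members of each mirrored pair are negatively correlated, which is why the second estimator should show lower variance at the same sample budget.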
