A RAD approach to deep mixture models

Flow-based models such as Real NVP are a powerful approach to density estimation. However, existing flow-based models are restricted to transforming continuous densities over a continuous input space into similarly continuous distributions over continuous latent variables. This makes them poorly suited for modeling and representing discrete structure in data distributions, for example class membership or discrete symmetries. To address this difficulty, we present a normalizing flow architecture which relies on domain partitioning using locally invertible functions, and which possesses both real-valued and discrete-valued latent variables. This Real and Discrete (RAD) approach retains the desirable normalizing flow properties of exact sampling, exact inference, and analytically computable probabilities, while at the same time allowing simultaneous modeling of both continuous and discrete structure in a data distribution.
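To make the construction concrete, below is a minimal one-dimensional sketch of a RAD-style density, not the paper's exact architecture. The domain is partitioned into two pieces by the locally invertible map f(x) = |x|; a discrete latent k = sign(x) records which piece x came from, and a gating distribution p(k | z) (here a hypothetical logistic gate with made-up parameters a, b) makes sampling and density evaluation exact.

```python
import numpy as np

def gate_prob_positive(z, a=1.5, b=0.0):
    """Hypothetical gating distribution p(k = +1 | z): a simple logistic in z."""
    return 1.0 / (1.0 + np.exp(-(a * z + b)))

def log_base_density(z):
    """Base log-density over z = |x| >= 0: a half-normal."""
    return 0.5 * np.log(2.0 / np.pi) - 0.5 * z ** 2

def log_density(x):
    """Exact log p(x) = log p_Z(|x|) + log p(k(x) | |x|).
    The local Jacobian of f(x) = |x| has |det| = 1, so it contributes nothing here."""
    z = np.abs(x)
    p_pos = gate_prob_positive(z)
    p_k = np.where(x >= 0, p_pos, 1.0 - p_pos)
    return log_base_density(z) + np.log(p_k)

def sample(n, rng=np.random.default_rng(0)):
    """Exact sampling: draw z from the base, draw the piece k from the gate,
    then apply the local inverse x = k * z."""
    z = np.abs(rng.standard_normal(n))                      # half-normal draw
    k = np.where(rng.random(n) < gate_prob_positive(z), 1.0, -1.0)
    return k * z

xs = sample(5)
print(xs, log_density(xs))
```

Because p(k = +1 | z) + p(k = -1 | z) = 1 for every z, the two branch densities integrate to 1 jointly, so both sampling and the likelihood remain exact; a deep version would replace |x| with a stack of learned piecewise-invertible maps and the logistic gate with a network conditioned on z.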
