Sum-of-Squares Polynomial Flow

The triangular map is a recent construct in probability theory that allows one to transform any source probability density function into any target density function. Based on triangular maps, we propose a general framework for high-dimensional density estimation, obtained by specifying one-dimensional transformations (equivalently, conditional densities) and appropriate conditioner networks. This framework (a) reveals the commonalities and differences of existing autoregressive and flow-based methods, (b) allows a unified understanding of the limitations and representation power of these recent approaches, and (c) motivates us to uncover a new Sum-of-Squares (SOS) flow that is interpretable, universal, and easy to train. We perform several synthetic experiments on various density geometries to demonstrate the benefits (and shortcomings) of such transformations. SOS flows achieve competitive results in simulations and on several real-world datasets.
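To make the idea of a one-dimensional increasing transformation concrete, the following is a minimal sketch rather than the authors' implementation. It assumes the transformation is the integral of a sum of squared polynomials, T(z) = c + ∫₀ᶻ Σₖ pₖ(u)² du, which is increasing because its derivative is non-negative. The function names, the coefficient layout, and the choice of a standard normal source density are illustrative assumptions.

```python
import numpy as np
from numpy.polynomial import polynomial as P


def sos_transform(z, coeffs, c=0.0):
    """Return T(z) and T'(z) for T(z) = c + integral_0^z sum_k p_k(u)^2 du.

    coeffs is a list of coefficient sequences (lowest degree first),
    one per polynomial p_k. This layout is an assumption for illustration.
    """
    integrand = np.zeros(1)
    for a in coeffs:
        # Squaring each polynomial keeps the integrand non-negative.
        integrand = P.polyadd(integrand, P.polymul(a, a))
    # polyint uses lower bound 0 and integration constant 0 by default,
    # so the antiderivative vanishes at 0.
    antiderivative = P.polyint(integrand)
    z = np.asarray(z, dtype=float)
    return c + P.polyval(z, antiderivative), P.polyval(z, integrand)


def log_density(x, coeffs, c=0.0):
    """Change-of-variables log-density, assuming T maps x to a standard normal."""
    t, dt = sos_transform(x, coeffs, c)
    log_source = -0.5 * t**2 - 0.5 * np.log(2.0 * np.pi)
    return log_source + np.log(dt)


# Example with p_1(u) = 1 and p_2(u) = u, so T(z) = z + z^3 / 3.
example_coeffs = [[1.0, 0.0], [0.0, 1.0]]
value, slope = sos_transform(3.0, example_coeffs)
```

In a triangular map over several dimensions, the coefficients for the j-th coordinate would be produced by a conditioner network from the preceding coordinates; here they are fixed by hand to keep the sketch self-contained.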
