The Expressive Power of a Class of Normalizing Flow Models

Normalizing flows have received a great deal of recent attention, as they enable flexible generative modeling as well as exact likelihood computation. While a wide variety of flow models have been proposed, there is little formal understanding of their representational power. In this work, we study some basic normalizing flows and rigorously establish bounds on their expressive power. Our results indicate that while these flows are highly expressive in one dimension, in higher dimensions their representational power may be limited, especially when the flows have moderate depth.
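For background on the likelihood computation referred to above (standard normalizing-flow machinery, not a result of this paper): a flow models data $x$ as $x = f(z)$ for an invertible map $f$ applied to a sample $z$ from a simple base density $p_Z$ (e.g., a standard Gaussian), so the exact log-likelihood follows from the change-of-variables formula,

\[
\log p_X(x) \;=\; \log p_Z\!\big(f^{-1}(x)\big) \;+\; \log\left|\det \frac{\partial f^{-1}}{\partial x}(x)\right|,
\]

and for a depth-$K$ composition $f = f_K \circ \cdots \circ f_1$ the layers' Jacobian log-determinants simply add. The expressivity question studied here is which target densities such compositions can represent at a given depth and dimension.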
