Invertible DenseNets with Concatenated LipSwish

We introduce Invertible Dense Networks (i-DenseNets), a more parameter-efficient alternative to Residual Flows. The method relies on an analysis of the Lipschitz continuity of the concatenation in DenseNets, which allows us to enforce invertibility of the network by constraining its Lipschitz constant to be strictly smaller than one. We extend this method with a learnable concatenation, which not only improves model performance but also indicates the importance of the concatenated representation. Additionally, we introduce the Concatenated LipSwish as an activation function, for which we show how to enforce the Lipschitz condition, and which boosts performance. The new architecture, i-DenseNet, outperforms Residual Flow and other flow-based models on density estimation evaluated in bits per dimension, under an equal parameter budget. Moreover, the proposed model outperforms Residual Flows when trained as a hybrid model, in which the model is both generative and discriminative.
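For concreteness, below is a minimal PyTorch sketch of the Concatenated LipSwish idea: concatenate Swish(x) and Swish(-x), then rescale so the concatenated nonlinearity is (approximately) 1-Lipschitz. This is an illustration under stated assumptions, not the authors' reference implementation: the numerical estimation of the normalizing constant, the softplus parametrization of β, and the names `swish`, `concat_lip_bound`, and `CLipSwish` are all choices made here for the example.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


def swish(x, beta):
    # Swish(x) = x * sigmoid(beta * x); LipSwish in the paper is Swish / 1.1.
    return x * torch.sigmoid(beta * x)


def concat_lip_bound(beta, lim=10.0, n=100001):
    # Numerically estimate the Lipschitz constant of x -> [Swish(x), Swish(-x)]:
    # the Jacobian row at x is (Swish'(x), -Swish'(-x)), so its 2-norm is
    # sqrt(Swish'(x)^2 + Swish'(-x)^2); we take the max over a symmetric grid.
    xs = torch.linspace(-lim, lim, n, device=beta.device)
    s = torch.sigmoid(beta * xs)
    dswish = s + beta * xs * s * (1 - s)  # closed-form Swish derivative
    return torch.sqrt(dswish ** 2 + dswish.flip(0) ** 2).max()


class CLipSwish(nn.Module):
    """Concatenated LipSwish sketch: phi(x) = [Swish(x), Swish(-x)] / L,
    with L an estimate of the Lipschitz constant of the concatenation,
    so that phi is (approximately) 1-Lipschitz. Doubles the channel count."""

    def __init__(self, beta_init=1.0):
        super().__init__()
        # Learnable beta kept positive via softplus (a modeling assumption here).
        self.raw_beta = nn.Parameter(torch.log(torch.expm1(torch.tensor(beta_init))))

    def forward(self, x):
        beta = F.softplus(self.raw_beta)
        with torch.no_grad():
            # Treat the normalizer as a constant w.r.t. autograd; it is
            # re-estimated each forward pass as beta changes during training.
            bound = concat_lip_bound(beta)
        out = torch.cat([swish(x, beta), swish(-x, beta)], dim=1)  # channel dim
        return out / bound
```

In a dense block this activation doubles the number of channels fed to the next layer; combined with spectrally normalized convolutions, the dense branch g can be kept strictly contractive so that x + g(x) remains invertible, as in Residual Flows.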
