Relaxing Bijectivity Constraints with Continuously Indexed Normalising Flows

We show that normalising flows become pathological when used to model targets whose supports have complicated topologies. In this scenario, we prove that a flow must become arbitrarily numerically non-invertible in order to approximate the target closely. This result has implications for all flow-based models, and especially for Residual Flows (ResFlows), which explicitly control the Lipschitz constant of their constituent bijection. To address this, we propose Continuously Indexed Flows (CIFs), which replace the single bijection used by normalising flows with a continuously indexed family of bijections, and which can intuitively "clean up" mass that would otherwise be misplaced by a single bijection. We show theoretically that CIFs are not subject to the same topological limitations as normalising flows, and we obtain better empirical performance on a variety of models and benchmarks.
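To make the construction concrete, below is a minimal sketch of a single continuously indexed step in PyTorch. Everything here is an illustrative assumption rather than the paper's implementation: the names `CIFLayer` and `gaussian_log_prob`, the simple affine family F(z; u) = exp(s(u)) ⊙ z + t(u), and the diagonal-Gaussian index distributions p(u | z) and q(u | x) are all placeholders for whatever bijection family and networks one actually chooses. The key idea it illustrates is that each step samples a continuous index u, inverts the u-indexed bijection, and pays an importance-weight correction log p(u | z) − log q(u | x) on top of the usual change-of-variables term, yielding a stochastic lower bound on log p(x) instead of an exact density.

```python
import math

import torch
import torch.nn as nn


def gaussian_log_prob(x, mu, log_sigma):
    """Diagonal-Gaussian log density, summed over the last dimension."""
    return (-0.5 * ((x - mu) / log_sigma.exp()) ** 2
            - log_sigma - 0.5 * math.log(2 * math.pi)).sum(-1)


class CIFLayer(nn.Module):
    """One continuously indexed step: x = F(z; u), with u a continuous index."""

    def __init__(self, dim, index_dim):
        super().__init__()
        # Networks defining the indexed bijection family
        # F(z; u) = exp(s(u)) * z + t(u) (a deliberately simple choice).
        self.s = nn.Linear(index_dim, dim)
        self.t = nn.Linear(index_dim, dim)
        # p(u | z): conditional prior over the index (diagonal Gaussian).
        self.p_u = nn.Linear(dim, 2 * index_dim)
        # q(u | x): approximate posterior over the index, used for the ELBO.
        self.q_u = nn.Linear(dim, 2 * index_dim)

    def forward(self, x):
        """Return z = F^{-1}(x; u) and this layer's ELBO contribution."""
        # Sample u ~ q(u | x) via the reparameterisation trick.
        mu_q, log_sig_q = self.q_u(x).chunk(2, dim=-1)
        u = mu_q + log_sig_q.exp() * torch.randn_like(mu_q)
        # Invert the u-indexed bijection and accumulate its log-Jacobian.
        s, t = self.s(u), self.t(u)
        z = (x - t) * torch.exp(-s)
        log_det = -s.sum(-1)  # log |det dF^{-1}(x; u) / dx|
        # Importance-weight correction: log p(u | z) - log q(u | x).
        mu_p, log_sig_p = self.p_u(z).chunk(2, dim=-1)
        log_p_u = gaussian_log_prob(u, mu_p, log_sig_p)
        log_q_u = gaussian_log_prob(u, mu_q, log_sig_q)
        return z, log_det + log_p_u - log_q_u


# Usage: a one-sample lower bound on log p(x) for a single layer
# with a standard-normal base density.
layer = CIFLayer(dim=2, index_dim=2)
x = torch.randn(64, 2)
z, extra_terms = layer(x)
elbo = gaussian_log_prob(z, torch.zeros_like(z), torch.zeros_like(z)) + extra_terms
```

Because u is resampled at every step, the composite map from z to x is no longer a single bijection, which is what lets the model place mass on targets with disconnected or otherwise complicated supports without the Lipschitz blow-up the paper proves is unavoidable for ordinary flows.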
