Approximation capabilities of measure-preserving neural networks

Measure-preserving neural networks are well-developed invertible models; however, their approximation capabilities remain unexplored. This paper rigorously analyzes the approximation capabilities of existing measure-preserving neural networks, including NICE and RevNets. It is shown that, for a compact set U ⊂ ℝ^D with D ≥ 2, measure-preserving neural networks can approximate, in the L^p-norm, any measure-preserving map ψ: U → ℝ^D that is bounded and injective. In particular, any continuously differentiable injective map whose Jacobian determinant equals ±1 is measure-preserving and can therefore be approximated.
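As a concrete illustration of the models analyzed here, the following is a minimal NumPy sketch of a NICE-style additive coupling layer; the function names and parameter shapes are illustrative, not taken from the paper. Half of the input passes through unchanged, which makes the layer exactly invertible, and the Jacobian is unit triangular, so its determinant is 1 and the map preserves Lebesgue measure.

    import numpy as np

    def shift_net(x2, W1, b1, W2, b2):
        # Hypothetical coupling function t: R^(D-d) -> R^d,
        # a one-hidden-layer MLP with leaky-ReLU activation.
        h = x2 @ W1 + b1
        h = np.maximum(0.2 * h, h)  # leaky ReLU
        return h @ W2 + b2

    def coupling_forward(x, d, params):
        # Additive coupling (NICE): y1 = x1 + t(x2), y2 = x2.
        # The Jacobian is unit triangular, hence det = 1: measure-preserving.
        x1, x2 = x[:d], x[d:]
        return np.concatenate([x1 + shift_net(x2, *params), x2])

    def coupling_inverse(y, d, params):
        # Exact inverse: subtract the same shift, recomputed from y2 = x2.
        y1, y2 = y[:d], y[d:]
        return np.concatenate([y1 - shift_net(y2, *params), y2])

    # Usage: D = 4 split at d = 2; deeper models stack such layers,
    # alternating which half is shifted.
    rng = np.random.default_rng(0)
    D, d, hidden = 4, 2, 8
    params = (rng.standard_normal((D - d, hidden)), np.zeros(hidden),
              rng.standard_normal((hidden, d)), np.zeros(d))
    x = rng.standard_normal(D)
    y = coupling_forward(x, d, params)
    assert np.allclose(coupling_inverse(y, d, params), x)

RevNet blocks compose two such additive updates (one per half of the input), so they are volume-preserving in the same sense; the approximation result above states that networks built by stacking these simple layers can approximate any bounded injective measure-preserving map in the L^p-norm.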

[1] Andrew L. Maas et al., Rectifier Nonlinearities Improve Neural Network Acoustic Models, 2013.

[2] George Em Karniadakis et al., SympNets: Intrinsic structure-preserving symplectic networks for identifying Hamiltonian systems, 2020, Neural Networks.

[3] David Duvenaud et al., Neural Ordinary Differential Equations, 2018, NeurIPS.

[4] Y. Brenier, Polar Factorization and Monotone Rearrangement of Vector-Valued Functions, 1991.

[5] Haizhao Yang et al., Neural Network Approximation: Three Hidden Layers Are Enough, 2020, Neural Networks.

[6] Feng Kang et al., Volume-preserving algorithms for source-free dynamical systems, 1995.

[7] Jason Yosinski et al., Hamiltonian Neural Networks, 2019, NeurIPS.

[8] Kamalika Chaudhuri et al., The Expressive Power of a Class of Normalizing Flow Models, 2020, AISTATS.

[9] Yee Whye Teh et al., Augmented Neural ODEs, 2019, NeurIPS.

[10] Simone G. O. Fiori et al., Extended Hamiltonian Learning on Riemannian Manifolds: Numerical Aspects, 2012, IEEE Transactions on Neural Networks and Learning Systems.

[11] David Duvenaud et al., Residual Flows for Invertible Generative Modeling, 2019, NeurIPS.

[12] E Weinan et al., A mean-field optimal control formulation of deep learning, 2018, Research in the Mathematical Sciences.

[13] Prafulla Dhariwal et al., Glow: Generative Flow with Invertible 1x1 Convolutions, 2018, NeurIPS.

[14] Yann Brenier et al., $L^p$ Approximation of maps by diffeomorphisms, 2003.

[15] Zhenguo Li et al., iVPF: Numerical Invertible Volume Preserving Flow for Efficient Lossless Compression, 2021, IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[16] Alexandre Lacoste et al., Neural Autoregressive Flows, 2018, ICML.

[17] Zhen Zhang et al., Learning Poisson systems and trajectories of autonomous systems via Poisson neural networks, 2020, arXiv.

[18] Kurt Hornik et al., Universal approximation of an unknown mapping and its derivatives using multilayer feedforward networks, 1990, Neural Networks.

[19] Shakir Mohamed et al., Variational Inference with Normalizing Flows, 2015, ICML.

[20] Samy Bengio et al., Density estimation using Real NVP, 2016, ICLR.

[21] Jürgen Schmidhuber, Deep learning in neural networks: An overview, 2014, Neural Networks.

[22] Léon Bottou et al., The Tradeoffs of Large Scale Learning, 2007, NIPS.

[23] Yoshua Bengio et al., NICE: Non-linear Independent Components Estimation, 2014, ICLR.

[24] E Weinan, A Proposal on Machine Learning via Dynamical Systems, 2017, Communications in Mathematics and Statistics.

[25] Geoffrey E. Hinton et al., ImageNet classification with deep convolutional neural networks, 2012, Commun. ACM.

[26] Zhiping Mao et al., DeepXDE: A Deep Learning Library for Solving Differential Equations, 2019, AAAI Spring Symposium: MLPS.

[27] Léon Bottou, Large-Scale Machine Learning with Stochastic Gradient Descent, 2010, COMPSTAT.

[28] George Em Karniadakis et al., Quantifying the generalization error in deep learning in terms of data distribution and neural network smoothness, 2019, Neural Networks.

[29] Zuowei Shen et al., Deep Learning via Dynamical Systems: An Approximation Perspective, 2019, Journal of the European Mathematical Society.

[30] George Cybenko, Approximation by superpositions of a sigmoidal function, 1989, Math. Control Signals Syst.

[31] Simone G. O. Fiori et al., Extended Hamiltonian Learning on Riemannian Manifolds: Theoretical Aspects, 2011, IEEE Transactions on Neural Networks.

[32] Ernst Hairer et al., Solving Ordinary Differential Equations I: Nonstiff Problems, 2009.

[33] Jian Sun et al., Deep Residual Learning for Image Recognition, 2016, IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[34] Molei Tao et al., Data-driven Prediction of General Hamiltonian Dynamics via Learning Exactly-Symplectic Maps, 2021, ICML.

[35] George Em Karniadakis et al., Learning nonlinear operators via DeepONet based on the universal approximation theorem of operators, 2019, Nature Machine Intelligence.

[36] D. Turaev, Polynomial approximations of symplectic dynamics and richness of chaos in non-hyperbolic area-preserving maps, 2003.

[37] Geoffrey E. Hinton et al., Deep Learning, 2015, Nature.

[38] Raquel Urtasun et al., The Reversible Residual Network: Backpropagation Without Storing Activations, 2017, NIPS.

[39] F. Krogh et al., Solving Ordinary Differential Equations, 2019, Programming for Computations - Python.

[40] Long Chen et al., Maximum Principle Based Algorithms for Deep Learning, 2017, J. Mach. Learn. Res.

[41] David Duvenaud et al., Invertible Residual Networks, 2018, ICML.

[42] E. Hairer et al., Geometric Numerical Integration: Structure-Preserving Algorithms for Ordinary Differential Equations, 2004.