Residual Networks as Flows of Diffeomorphisms

This paper addresses the understanding and characterization of residual networks (ResNets), which are among the state-of-the-art deep learning architectures for a variety of supervised learning problems. We focus on the mapping component of ResNets, which maps the embedding space onto a new, unknown space in which prediction or classification can be performed according to linear criteria. We show that this mapping component can be regarded as the numerical implementation of continuous flows of diffeomorphisms governed by ordinary differential equations. In particular, ResNets with shared weights are fully characterized as numerical approximations of exponential diffeomorphic operators. We stress, both theoretically and numerically, the relevance of enforcing diffeomorphic properties and the importance of numerical issues in making the continuous formulation consistent with the discretized ResNet implementation. We further discuss the resulting theoretical and computational insights into ResNet architectures.
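To make the stated correspondence concrete, the sketch below illustrates, under simplifying assumptions, the idea behind the shared-weight case: a toy stationary velocity field v (standing in for a shared-weight residual block, with the integration step written explicitly rather than absorbed into the weights) is iterated N times as x ← x + (1/N) v(x), which is a forward Euler discretization of dx/dt = v(x) and therefore approximates the time-one flow, i.e. the exponential, of v. This is an illustration of the principle only, not the paper's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
d, N = 4, 64                          # state dimension, number of residual blocks

# Hypothetical stationary velocity field v(x); a stand-in for a shared-weight residual block.
W = 0.1 * rng.standard_normal((d, d))
b = 0.1 * rng.standard_normal(d)
v = lambda x: np.tanh(x @ W + b)

def resnet_shared(x, n_steps=N):
    """N identical residual steps = forward Euler integration of dx/dt = v(x) over [0, 1]."""
    h = 1.0 / n_steps                 # step size; in a real ResNet it is absorbed into the weights
    for _ in range(n_steps):
        x = x + h * v(x)              # x_{k+1} = x_k + h v(x_k): one residual block
    return x

def ode_flow(x, n_steps=10_000):
    """Reference time-one flow exp(v): the same ODE integrated with a much finer step."""
    h = 1.0 / n_steps
    for _ in range(n_steps):
        x = x + h * v(x)
    return x

x0 = rng.standard_normal(d)
# Small discrepancy: the stacked shared-weight blocks approximate the exponential of v.
print(np.linalg.norm(resnet_shared(x0) - ode_flow(x0)))
```

Reducing the step size (equivalently, stacking more blocks) shrinks the gap to the continuous flow at the usual first-order Euler rate, which is the sense in which the discretized ResNet and the continuous formulation are made consistent.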
