Go with the Flow: Adaptive Control for Neural ODEs

Despite their elegant formulation and lightweight memory cost, neural ordinary differential equations (NODEs) suffer from known representational limitations. In particular, the single flow learned by a NODE cannot express all homeomorphisms from a given data space to itself, and the static weight parametrization restricts the types of functions NODEs can learn compared to discrete architectures with layer-dependent weights. Here, we describe a new module called the neurally-controlled ODE (N-CODE), designed to improve the expressivity of NODEs. The parameters of N-CODE modules are dynamic variables governed by a trainable map from either the initial or the current activation state, resulting in open-loop and closed-loop control, respectively. A single module is sufficient for learning a distribution over non-autonomous flows that adaptively drive neural representations. We provide theoretical and empirical evidence that N-CODE circumvents the limitations of previous models and show how the increased expressivity manifests in several domains. In supervised learning, our framework outperforms NODEs in both training speed and test accuracy. In unsupervised learning, we apply this control perspective to an image autoencoder endowed with a latent transformation flow, greatly improving representational power over a vanilla model and achieving state-of-the-art image reconstruction on CIFAR-10.
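As a concrete illustration of the closed-loop case, the sketch below shows a toy N-CODE-style block in PyTorch: a controller network maps the current activation state z(t) to the weights of the flow field at every integration step, so the learned dynamics are non-autonomous and state-dependent. The class name `NCODEBlock`, the layer sizes, and the fixed-step Euler integration are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch (not the paper's implementation) of a closed-loop
# neurally-controlled ODE block: the weights of the flow field are
# dynamic variables produced at each step by a trainable map from the
# current state. Integration uses a simple fixed-step Euler scheme.
import torch
import torch.nn as nn


class NCODEBlock(nn.Module):
    def __init__(self, dim: int, hidden: int = 64, steps: int = 10):
        super().__init__()
        self.dim, self.hidden, self.steps = dim, hidden, steps
        # Controller: maps the current state z(t) to the parameters
        # (weights and biases) of a one-hidden-layer flow field.
        n_params = hidden * dim + hidden + dim * hidden + dim
        self.controller = nn.Linear(dim, n_params)

    def flow(self, z: torch.Tensor, params: torch.Tensor) -> torch.Tensor:
        # Unpack the controller output into per-sample weights/biases
        # and evaluate dz/dt = W2 tanh(W1 z + b1) + b2.
        d, h = self.dim, self.hidden
        i = 0
        W1 = params[:, i:i + h * d].view(-1, h, d); i += h * d
        b1 = params[:, i:i + h]; i += h
        W2 = params[:, i:i + d * h].view(-1, d, h); i += d * h
        b2 = params[:, i:i + d]
        h_act = torch.tanh(torch.bmm(W1, z.unsqueeze(-1)).squeeze(-1) + b1)
        return torch.bmm(W2, h_act.unsqueeze(-1)).squeeze(-1) + b2

    def forward(self, z0: torch.Tensor, t1: float = 1.0) -> torch.Tensor:
        z, dt = z0, t1 / self.steps
        for _ in range(self.steps):
            params = self.controller(z)        # closed loop: params track z(t)
            z = z + dt * self.flow(z, params)  # explicit Euler update
        return z


if __name__ == "__main__":
    block = NCODEBlock(dim=8)
    x = torch.randn(4, 8)
    print(block(x).shape)  # torch.Size([4, 8])
```

Feeding the controller the fixed initial state z(0) instead of z(t) would give the open-loop variant described above, where the control signal is planned once from the input rather than updated along the trajectory.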
