ANODEV2: A Coupled Neural ODE Evolution Framework

It has been observed that residual networks can be viewed as the explicit Euler discretization of an ordinary differential equation (ODE). This observation motivated the introduction of so-called Neural ODEs, which allow more general discretization schemes with adaptive time stepping. Here, we propose ANODEV2, an extension of this approach that also allows the neural network parameters to evolve in time, in a coupled ODE-based formulation. The earlier Neural ODE method is in fact a special case of this new, more general framework. We present the formulation of ANODEV2, derive optimality conditions, and implement a coupled reaction-diffusion-advection version of the framework in PyTorch. We report empirical results for several configurations of ANODEV2, tested on multiple models on CIFAR-10. The results show that this coupled ODE-based framework is indeed trainable, and that it achieves higher accuracy than both the baseline models and the recently proposed Neural ODE approach.
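To make the coupling concrete, here is a schematic sketch of the formulation the abstract describes; the symbols f, q, p, z_0, and \theta_0 are generic placeholders rather than the paper's exact notation. A residual block z_{k+1} = z_k + f(z_k, \theta_k) is one explicit-Euler step of the activation ODE, and ANODEV2 additionally lets the parameters evolve in time:

\[
\frac{dz(t)}{dt} = f\bigl(z(t), \theta(t)\bigr), \quad z(0) = z_0, \qquad
\frac{d\theta(t)}{dt} = q\bigl(\theta(t), p\bigr), \quad \theta(0) = \theta_0,
\]

where q is, for instance, a discretized reaction-diffusion-advection operator acting on the parameters, and p collects the trainable control variables. Setting q \equiv 0 freezes \theta(t) \equiv \theta_0 and recovers the standard Neural ODE as a special case. In code, a minimal explicit-Euler solve of this coupled system might look as follows (a hypothetical sketch, not the paper's implementation; f and q are user-supplied callables):

    import torch

    def coupled_euler(z0, theta0, f, q, steps=10, dt=0.1):
        """Jointly integrate activations z and parameters theta with
        explicit Euler; a schematic sketch of the coupled ODE system."""
        z, theta = z0, theta0
        for _ in range(steps):
            z_new = z + dt * f(z, theta)   # activation dynamics dz/dt = f(z, theta)
            theta = theta + dt * q(theta)  # parameter evolution dtheta/dt = q(theta, p)
            z = z_new
        return z

    # Usage sketch: with q returning zeros, theta stays fixed and the
    # loop reduces to the usual Neural ODE / ResNet-style update.
    z0, theta0 = torch.zeros(8), torch.ones(8)
    zT = coupled_euler(z0, theta0,
                       f=lambda z, th: th * z,
                       q=lambda th: torch.zeros_like(th))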
