Neural network architecture search with differentiable Cartesian genetic programming for regression

While optimized neural network architectures are essential for effective training with gradient descent, their development remains a challenging and resource-intensive process full of trial-and-error iterations. We propose to encode neural networks with a differentiable variant of Cartesian Genetic Programming (dCGPANN) and present a memetic algorithm for architecture design: local searches with gradient descent learn the network parameters, while evolutionary operators act on the dCGPANN genes, shaping the network architecture towards faster learning. Studying a particular instance of such a learning scheme, we are able to improve the starting feed-forward topology by learning how to rewire and prune links, adapt activation functions and introduce skip connections for chosen regression tasks. The evolved network architectures require less space for network parameters and, given the same amount of training time, reach a significantly lower error on average.
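To make the memetic scheme described above concrete, the following is a minimal, self-contained sketch in Python/NumPy of one plausible interplay between evolutionary operators and gradient descent: an outer (1+λ) loop mutates discrete architecture genes (here, which hidden units are active and which activation function each uses), while an inner local search of plain gradient descent fits the continuous weights on a toy regression task. All names, hyperparameters and the toy problem are illustrative assumptions and do not come from the paper or from the dcgpy library.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy regression target: y = sin(x)  (illustrative only)
X = np.linspace(-3.0, 3.0, 200).reshape(-1, 1)
y = np.sin(X)

N_HIDDEN = 8
# activation id -> (function, derivative w.r.t. the pre-activation)
ACTS = {
    0: (np.tanh, lambda z: 1.0 - np.tanh(z) ** 2),
    1: (lambda z: np.maximum(z, 0.0), lambda z: (z > 0.0).astype(float)),
    2: (lambda z: z, lambda z: np.ones_like(z)),
}

def forward(genes, params):
    mask, act_ids = genes
    W1, b1, W2, b2 = params
    Z1 = X @ W1 + b1
    A1 = np.column_stack([ACTS[a][0](Z1[:, j]) for j, a in enumerate(act_ids)])
    A1 = A1 * mask                              # pruned units output zero
    return Z1, A1, A1 @ W2 + b2

def train(genes, params, steps=200, lr=0.05):
    """Local search: plain gradient descent on the continuous weights."""
    mask, act_ids = genes
    W1, b1, W2, b2 = [p.copy() for p in params]
    n = len(X)
    for _ in range(steps):
        Z1, A1, pred = forward(genes, (W1, b1, W2, b2))
        dpred = 2.0 * (pred - y) / n            # d(MSE)/d(pred)
        dW2, db2 = A1.T @ dpred, dpred.sum(0)
        dA1 = (dpred @ W2.T) * mask
        dZ1 = np.column_stack([ACTS[a][1](Z1[:, j]) * dA1[:, j]
                               for j, a in enumerate(act_ids)])
        dW1, db1 = X.T @ dZ1, dZ1.sum(0)
        W1 -= lr * dW1
        b1 -= lr * db1
        W2 -= lr * dW2
        b2 -= lr * db2
    loss = float(np.mean((forward(genes, (W1, b1, W2, b2))[2] - y) ** 2))
    return loss, (W1, b1, W2, b2)

def mutate(genes):
    """Evolutionary operator acting on the discrete architecture genes."""
    mask, act_ids = genes[0].copy(), genes[1].copy()
    j = rng.integers(N_HIDDEN)
    if rng.random() < 0.5:
        mask[j] = 1.0 - mask[j]                 # prune or re-enable a hidden unit
    else:
        act_ids[j] = int(rng.integers(len(ACTS)))   # swap its activation function
    return mask, act_ids

# (1 + lambda) memetic loop: mutate the architecture, retrain the weights with
# gradient descent, and keep whichever network fits the data best.
genes = (np.ones(N_HIDDEN), [0] * N_HIDDEN)
params = (0.5 * rng.standard_normal((1, N_HIDDEN)), np.zeros(N_HIDDEN),
          0.5 * rng.standard_normal((N_HIDDEN, 1)), np.zeros(1))
best_loss, best_params = train(genes, params)
for gen in range(20):
    for _ in range(4):                          # lambda = 4 offspring per generation
        child = mutate(genes)
        loss, trained = train(child, best_params)   # offspring inherit trained weights
        if loss <= best_loss:
            genes, best_loss, best_params = child, loss, trained
    print(f"generation {gen:2d}  MSE {best_loss:.4f}  "
          f"active units {int(genes[0].sum())}")
```

The offspring here inherit the trained weights of the current best network (a Lamarckian choice), so each mutation continues the local search rather than restarting it; this is one reasonable design option for such a memetic loop, not necessarily the exact variant evaluated in the paper.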
