PathNet: Evolution Channels Gradient Descent in Super Neural Networks

For artificial general intelligence (AGI) it would be efficient if multiple users trained the same giant neural network, permitting parameter reuse without catastrophic forgetting. PathNet is a first step in this direction. It is a neural network algorithm that uses agents embedded in the network whose task is to discover which parts of the network to re-use for new tasks. Agents are pathways (views) through the network that determine the subset of parameters used and updated by the forward and backward passes of the backpropagation algorithm. During learning, a tournament-selection genetic algorithm is used to select pathways through the neural network for replication and mutation. Pathway fitness is the performance of that pathway measured according to a cost function. We demonstrate successful transfer learning: fixing the parameters along a path learned on task A and re-evolving a new population of paths for task B allows task B to be learned faster than from scratch or after fine-tuning. Paths evolved on task B re-use parts of the optimal path evolved on task A. Positive transfer was demonstrated for binary MNIST, CIFAR, and SVHN supervised classification tasks, and for a set of Atari and Labyrinth reinforcement learning tasks, suggesting PathNets have general applicability for neural network training. Finally, PathNet also significantly improves the robustness of a parallel asynchronous reinforcement learning algorithm (A3C) to hyperparameter choices.
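The abstract compresses the mechanism into a few sentences, so a minimal sketch may help make it concrete. The Python below shows the evolutionary outer loop over pathway genotypes: a pathway lists which modules are active in each layer, and a microbial-style tournament repeatedly evaluates two random pathways, then overwrites the loser with a mutated copy of the winner. All names and hyperparameters here (`random_pathway`, `tournament_step`, `PATH_WIDTH`, the mutation scheme, the toy fitness) are illustrative assumptions, not the paper's exact implementation; in the actual system, evaluating a pathway means training only its active modules with backpropagation and measuring task performance.

```python
import random

# Illustrative hyperparameters. The paper uses networks of comparable
# shape (a few layers, 10-20 modules per layer, a handful of active
# modules per layer), but these exact values are assumptions.
NUM_LAYERS = 3          # depth of the super network
MODULES_PER_LAYER = 10  # modules available in each layer
PATH_WIDTH = 3          # max active modules per layer
MUTATION_PROB = 0.1     # per-gene mutation probability (assumed)

def random_pathway():
    """A genotype: for each layer, the indices of its active modules."""
    return [random.sample(range(MODULES_PER_LAYER), PATH_WIDTH)
            for _ in range(NUM_LAYERS)]

def mutate(pathway):
    """Independently shift each gene to a nearby module index.
    Duplicate indices within a layer simply mean that module is
    activated once."""
    return [[(m + random.randint(-2, 2)) % MODULES_PER_LAYER
             if random.random() < MUTATION_PROB else m
             for m in layer]
            for layer in pathway]

def tournament_step(population, fitness, evaluate):
    """One microbial-GA tournament: evaluate two random pathways and
    overwrite the loser with a mutated copy of the winner. `evaluate`
    stands in for training the pathway's active modules with backprop
    for a fixed budget and returning reward or negative loss."""
    a, b = random.sample(range(len(population)), 2)
    for i in (a, b):
        fitness[i] = evaluate(population[i])
    winner, loser = (a, b) if fitness[a] >= fitness[b] else (b, a)
    population[loser] = mutate(population[winner])

# Toy demonstration: a stand-in fitness that rewards low module
# indices, in place of real task performance.
if __name__ == "__main__":
    toy_fitness = lambda path: -sum(sum(layer) for layer in path)
    population = [random_pathway() for _ in range(64)]
    fitness = [float("-inf")] * len(population)
    for _ in range(500):
        tournament_step(population, fitness, toy_fitness)
    print(max(population, key=toy_fitness))
```

For transfer, the modules along the best path found for task A would then be frozen, and a fresh population evolved for task B, free to route through and re-use the frozen modules.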
