Distilling deep neural networks with reinforcement learning

Deep architectures can improve the performance of neural networks, but they also increase computational complexity; compressing networks is key to addressing this problem. The Knowledge Distillation (KD) framework compresses cumbersome networks well: it improves on mimic learning by enabling knowledge to be transferred from a cumbersome network to a compressed network without constraints on architecture. Inspired by AlphaGo Zero, this paper proposes an algorithm that combines KD with reinforcement learning to compress networks on changing datasets. In this algorithm, the compressed network interacts with an environment built from KD to produce datasets that are appropriate for the model. The Monte Carlo Tree Search (MCTS) of AlphaGo Zero is used to generate these datasets by trading off between the predictions of the compressed network and the teacher's knowledge. In experiments, the algorithm proved effective at compressing networks when training ResNet on the CIFAR datasets, with mean squared error as the objective function.
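The abstract names mean squared error between the compressed (student) network and the teacher's knowledge as the objective, with a trade-off between the student's own predictions and the teacher's outputs. A minimal sketch of those two ingredients is below; the function names and the blending weight `alpha` are illustrative assumptions, not the paper's actual interface:

```python
import numpy as np

def kd_mse_loss(student_logits, teacher_logits):
    """Mean squared error between student and teacher logits,
    the distillation objective named in the abstract."""
    s = np.asarray(student_logits, dtype=float)
    t = np.asarray(teacher_logits, dtype=float)
    return float(np.mean((s - t) ** 2))

def blended_target(student_pred, teacher_pred, alpha):
    """Illustrative trade-off between the student's own prediction
    and the teacher's knowledge; alpha in [0, 1] weights the teacher.
    (A stand-in for the role MCTS plays in the paper.)"""
    s = np.asarray(student_pred, dtype=float)
    t = np.asarray(teacher_pred, dtype=float)
    return alpha * t + (1.0 - alpha) * s
```

For example, `kd_mse_loss([0.0, 0.0], [2.0, 2.0])` evaluates to `4.0`, and `blended_target` with `alpha=1.0` returns the teacher's prediction unchanged.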
