A New Learning Automata-Based Pruning Method to Train Deep Neural Networks

Deep neural networks are among the most powerful models in machine learning: they can automatically learn underlying patterns from large amounts of data, and they are therefore used in a growing number of Internet-of-Things (IoT) applications. However, training deep models is difficult, and it suffers from overfitting and the vanishing gradient problem. In addition, the large number of parameters and multiplication operations makes it impractical to execute most deep learning models directly on target hardware. In this paper, we propose a method that gradually prunes weakly connected weights during training to improve on traditional stochastic gradient descent. To find the weakly connected weights, we adopt a reinforcement learning method called learning automata, chosen for its strong decision-making ability in stochastic and nonstationary environments. Starting from an initially fully connected neural network, the proposed method learns a more effective, sparsely connected architecture during training. Experiments on MNIST show that our method is more resistant to overfitting and achieves better generalization performance on the test set. Moreover, the resulting thin, sparsely connected model is better suited to IoT applications.
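As a rough illustration of how a learning automaton could flag weakly connected weights, the following is a minimal sketch in Python, not the scheme used in the paper. It assumes one two-action automaton (keep or prune) per weight with a linear reward-inaction update. The reward rule (a weight counts as weak when its magnitude is below a threshold), the names `TwoActionAutomaton` and `find_weak_weights`, and all parameter values are hypothetical choices made for the example; the abstract does not specify the actual reward signal or update rule.

```python
import random


class TwoActionAutomaton:
    """Two-action learning automaton (keep / prune), linear reward-inaction update.

    Illustrative only: the paper's actual automaton scheme is not given here.
    """

    def __init__(self, learning_rate=0.1):
        self.learning_rate = learning_rate
        self.p_prune = 0.5  # probability of choosing "prune"; "keep" has 1 - p_prune

    def choose(self, rng):
        """Sample an action from the current action probabilities."""
        return "prune" if rng.random() < self.p_prune else "keep"

    def update(self, action, rewarded):
        """Reward-inaction: reinforce the chosen action on reward, ignore penalties."""
        if not rewarded:
            return
        if action == "prune":
            self.p_prune += self.learning_rate * (1.0 - self.p_prune)
        else:
            self.p_prune -= self.learning_rate * self.p_prune


def find_weak_weights(weights, threshold=0.05, rounds=200, prune_above=0.9, seed=0):
    """Return a boolean mask marking weights whose automaton converged to "prune".

    Assumed (hypothetical) environment: "prune" is rewarded when |w| < threshold,
    and "keep" is rewarded otherwise. A real training loop would use a noisy
    signal computed during stochastic gradient descent instead.
    """
    rng = random.Random(seed)
    automata = [TwoActionAutomaton() for _ in weights]
    for _ in range(rounds):
        for w, automaton in zip(weights, automata):
            action = automaton.choose(rng)
            is_weak = abs(w) < threshold
            rewarded = (action == "prune") == is_weak
            automaton.update(action, rewarded)
    return [automaton.p_prune > prune_above for automaton in automata]


def apply_mask(weights, mask):
    """Zero out the weights flagged for pruning."""
    return [0.0 if pruned else w for w, pruned in zip(weights, mask)]


if __name__ == "__main__":
    weights = [0.8, 0.01, -0.5, -0.02]
    mask = find_weak_weights(weights)
    print(apply_mask(weights, mask))
```

In a gradual pruning setting, a step like this would be repeated at intervals during training, so that the mask grows over time rather than being fixed in a single pass.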
