Dropout algorithms for recurrent neural networks

Over the last decade, advances in hardware have allowed neural networks to grow considerably in size. Dropout is a popular deep learning technique that has been shown to improve the performance of large neural networks. Recurrent neural networks are powerful models specialised in learning from time series data. Three different approaches to combining Dropout with recurrent neural networks have been proposed; however, these approaches have not been evaluated under identical experimental conditions. This article investigates the performance of these Dropout approaches on a 2D physics simulation benchmark. Statistical tests showed that using Dropout did improve network performance on the benchmark. However, contrary to the literature, the Dropout approach that was expected to perform poorly performed well, while the approach that was expected to perform well performed poorly.
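
The abstract does not specify which Dropout variants the article compares, so the sketch below is purely illustrative: it shows two widely cited ways of applying Dropout to a plain recurrent network, one with fresh masks on the non-recurrent (input) connections at every time step (in the spirit of Zaremba et al., 2014) and one with a single mask per sequence applied to both input and recurrent connections (in the spirit of Gal and Ghahramani's variational dropout). The network sizes, keep probability, and helper names are assumptions made for this sketch, not details taken from the article.

    # Minimal sketch (not the article's code) of two ways to combine
    # Dropout with a vanilla RNN, using NumPy only. All sizes and the
    # keep probability below are illustrative assumptions.
    import numpy as np

    rng = np.random.default_rng(0)
    n_in, n_hid, T, keep_prob = 4, 8, 5, 0.8

    W_xh = rng.normal(scale=0.1, size=(n_hid, n_in))
    W_hh = rng.normal(scale=0.1, size=(n_hid, n_hid))
    b_h = np.zeros(n_hid)
    xs = rng.normal(size=(T, n_in))  # a toy input sequence

    def bernoulli_mask(size):
        """Inverted-dropout mask: zeros out units and rescales survivors."""
        return (rng.random(size) < keep_prob) / keep_prob

    def rnn_dropout_per_step(xs):
        """Dropout on the non-recurrent (input) connections only,
        with a fresh mask sampled at every time step
        (in the spirit of Zaremba et al., 2014)."""
        h = np.zeros(n_hid)
        for x in xs:
            x = x * bernoulli_mask(n_in)             # new mask each step
            h = np.tanh(W_xh @ x + W_hh @ h + b_h)   # recurrent path untouched
        return h

    def rnn_dropout_variational(xs):
        """Dropout on both input and recurrent connections, with one
        mask per sequence reused at every time step
        (in the spirit of Gal and Ghahramani, 2015)."""
        m_x = bernoulli_mask(n_in)   # fixed masks, shared across all steps
        m_h = bernoulli_mask(n_hid)
        h = np.zeros(n_hid)
        for x in xs:
            h = np.tanh(W_xh @ (x * m_x) + W_hh @ (h * m_h) + b_h)
        return h

    print(rnn_dropout_per_step(xs))
    print(rnn_dropout_variational(xs))

The key design difference the sketch highlights is where the noise is injected and how often it is resampled: per-step masks on the inputs leave the recurrent dynamics deterministic, whereas a fixed per-sequence mask also regularises the hidden-to-hidden connections.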
