Where to apply dropout in recurrent neural networks for handwriting recognition?

The dropout technique is a data-driven regularization method for neural networks. It consists of randomly setting some activations of a given hidden layer to zero during training. Repeating the procedure for each training example is equivalent to sampling a network from an exponential number of architectures that share weights. The goal of dropout is to prevent feature detectors from relying on each other. Dropout has been applied successfully to deep MLPs and to convolutional neural networks for various speech recognition and computer vision tasks. We recently proposed a way to use dropout in MDLSTM-RNNs for handwritten word and line recognition. In this paper, we show that further improvement can be achieved by implementing dropout differently, more specifically by applying it at better positions relative to the LSTM units.
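As a rough illustration of the placement question raised above, the sketch below shows dropout applied either before a recurrent layer (on its inputs) or after it (on its outputs), with the recurrent connections themselves left untouched. This is a minimal, hypothetical PyTorch example using a plain one-dimensional LSTM; it is not the MDLSTM architecture or the exact placements evaluated in the paper.

```python
# Minimal sketch (assumed PyTorch example, not the authors' MDLSTM system):
# dropout placed either before or after a recurrent layer.
import torch
import torch.nn as nn

class LSTMWithDropout(nn.Module):
    def __init__(self, input_size, hidden_size, num_classes,
                 p=0.5, position="after"):
        super().__init__()
        assert position in ("before", "after")
        self.position = position
        self.dropout = nn.Dropout(p)   # randomly zeroes activations during training
        self.lstm = nn.LSTM(input_size, hidden_size, batch_first=True)
        self.fc = nn.Linear(hidden_size, num_classes)

    def forward(self, x):              # x: (batch, time, input_size)
        if self.position == "before":
            x = self.dropout(x)        # dropout on the LSTM inputs
        out, _ = self.lstm(x)          # recurrent connections are untouched
        if self.position == "after":
            out = self.dropout(out)    # dropout on the LSTM outputs
        return self.fc(out)            # per-frame class scores

# Usage: compare the two placements on the same toy input.
x = torch.randn(2, 50, 32)             # (batch, time, features)
for pos in ("before", "after"):
    model = LSTMWithDropout(32, 64, 80, p=0.5, position=pos)
    model.train()                       # dropout is active only in training mode
    print(pos, model(x).shape)          # torch.Size([2, 50, 80])
```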
