Dropout Improves Recurrent Neural Networks for Handwriting Recognition

Recurrent neural networks (RNNs) with Long Short-Term Memory (LSTM) cells currently hold the best known results in unconstrained handwriting recognition. We show that their performance can be greatly improved using dropout, a recently proposed regularization method for deep architectures. While previous work showed that dropout gives superior performance in convolutional networks, it had never been applied to RNNs. In our approach, dropout is applied carefully so that it does not affect the recurrent connections; hence the power of RNNs in modeling sequences is preserved. Extensive experiments on a broad range of handwriting databases confirm the effectiveness of dropout on deep architectures, even when the network consists mainly of recurrent and shared connections.
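
The idea of keeping dropout away from the recurrent connections can be illustrated with a minimal NumPy sketch. It is not the implementation used in this work: it assumes a plain tanh RNN layer instead of LSTM cells, uses the common "inverted" dropout scaling, and all names (`dropout`, `rnn_layer_forward`, `W_in`, `W_rec`) are hypothetical. The point it shows is that the hidden state fed back at the next time step is never masked; only the copy passed on to the next layer is.

```python
import numpy as np

def dropout(x, p, rng, train=True):
    """Inverted dropout: zero each unit with probability p, rescale the rest."""
    if not train or p == 0.0:
        return x
    mask = rng.random(x.shape) >= p
    return x * mask / (1.0 - p)

def rnn_layer_forward(inputs, W_in, W_rec, b, p, rng, train=True):
    """Run a tanh RNN over a sequence of input vectors.

    Assumption for illustration: a simple tanh recurrence, not an LSTM.
    Dropout is applied only to the feed-forward output of the layer.
    """
    h = np.zeros(W_rec.shape[0])
    outputs = []
    for x_t in inputs:
        # Recurrent path: the previous state h is used as-is, never dropped.
        h = np.tanh(W_in @ x_t + W_rec @ h + b)
        # Feed-forward path: only the copy sent to the next layer is dropped.
        outputs.append(dropout(h, p, rng, train))
    return np.stack(outputs)

# Usage with arbitrary small dimensions (hypothetical values).
rng = np.random.default_rng(0)
T, n_in, n_hid = 5, 3, 4
inputs = rng.standard_normal((T, n_in))
W_in = rng.standard_normal((n_hid, n_in)) * 0.1
W_rec = rng.standard_normal((n_hid, n_hid)) * 0.1
b = np.zeros(n_hid)

out_eval = rnn_layer_forward(inputs, W_in, W_rec, b, p=0.5, rng=rng, train=False)
out_train = rnn_layer_forward(inputs, W_in, W_rec, b, p=0.5, rng=rng, train=True)
```

Because the mask never touches `h` itself, the sequence of hidden states is identical in training and evaluation mode; the training output differs from the evaluation output only by the mask and the rescaling factor.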
