Offline continuous handwriting recognition using sequence to sequence neural networks

Abstract This paper proposes a new neural network architecture that combines a deep convolutional neural network with an encoder–decoder, known as sequence to sequence, to recognize isolated handwritten words. The architecture aims to identify the characters and contextualize them with their neighbors in order to recognize any given word. The model introduces a novel way to extract relevant visual features from a word image: a horizontal sliding window extracts image patches, and the LeNet-5 convolutional architecture is applied to these patches to identify the characters. The extracted features are then modeled with a sequence-to-sequence architecture that encodes the visual characteristics and decodes the sequence of characters in the handwritten text image. We test the proposed model on two handwriting databases (IAM and RIMES) in several experiments to determine the optimal parameterization of the model. The results are competitive with, and improve on, those reported for current state-of-the-art handwriting models. Without using any language model and with a closed dictionary, we obtain a word error rate on the test set of 12.7% on IAM and 6.6% on RIMES.
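The horizontal sliding-window step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `extract_patches`, the window width, the stride, the image size, and the zero padding for narrow images are all assumptions chosen for illustration, since the abstract does not state them.

```python
import numpy as np


def extract_patches(word_image, window_width=10, stride=2):
    """Slide a fixed-width window from left to right over a word image.

    word_image: 2-D grayscale array of shape (height, width).
    Returns an array of shape (num_patches, height, window_width).
    window_width and stride are illustrative values, not the paper's.
    """
    height, width = word_image.shape
    if width < window_width:
        # Assumption: pad narrow images on the right with zeros
        # so that at least one full patch can be extracted.
        pad = window_width - width
        word_image = np.pad(word_image, ((0, 0), (0, pad)), mode="constant")
        width = window_width
    # Left edge of every window position that fits inside the image.
    starts = range(0, width - window_width + 1, stride)
    return np.stack([word_image[:, s:s + window_width] for s in starts])


# Hypothetical 32 x 100 word image filled with zeros.
image = np.zeros((32, 100), dtype=np.float32)
patches = extract_patches(image, window_width=10, stride=2)
print(patches.shape)  # (46, 32, 10)
```

Each patch in the resulting sequence would then be passed through the convolutional feature extractor, and the ordered sequence of feature vectors would form the input to the encoder.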
