A Novel Connectionist System for Unconstrained Handwriting Recognition

Recognizing lines of unconstrained handwritten text is a challenging task. The difficulty of segmenting cursive or overlapping characters, combined with the need to exploit surrounding context, has led to low recognition rates for even the best current recognizers. Most recent progress in the field has been made either through improved preprocessing or through advances in language modeling. Relatively little work has been done on the basic recognition algorithms. Indeed, most systems rely on the same hidden Markov models that have been used for decades in speech and handwriting recognition, despite their well-known shortcomings. This paper proposes an alternative approach based on a novel type of recurrent neural network, specifically designed for sequence labeling tasks where the data is hard to segment and contains long-range bidirectional interdependencies. In experiments on two large unconstrained handwriting databases, our approach achieves word recognition accuracies of 79.7 percent on online data and 74.1 percent on offline data, significantly outperforming a state-of-the-art HMM-based system. In addition, we demonstrate the network's robustness to lexicon size, measure the individual influence of its hidden layers, and analyze its use of context. Last, we provide an in-depth discussion of the differences between the network and HMMs, suggesting reasons for the network's superior performance.
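The abstract reports word recognition accuracies without stating how they are computed. As a minimal sketch, assuming the common edit-distance definition (100 times one minus the number of word insertions, deletions, and substitutions divided by the number of reference words), the metric could be computed as follows. The function names `word_edit_distance` and `word_accuracy` are hypothetical and are not taken from the paper.

```python
def word_edit_distance(reference, hypothesis):
    """Levenshtein distance between two lists of words."""
    # prev[j] holds the distance between the reference prefix processed
    # so far and the first j words of the hypothesis.
    prev = list(range(len(hypothesis) + 1))
    for i, ref_word in enumerate(reference, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hypothesis, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def word_accuracy(references, hypotheses):
    """Word accuracy in percent over paired text lines (assumed definition)."""
    errors = sum(word_edit_distance(r.split(), h.split())
                 for r, h in zip(references, hypotheses))
    total = sum(len(r.split()) for r in references)
    return 100.0 * (1.0 - errors / total)


if __name__ == "__main__":
    refs = ["the quick brown fox"]
    hyps = ["the quack brown"]
    print(word_accuracy(refs, hyps))
```

Under this definition, insertions count as errors, so the score can fall below zero when a hypothesis contains many spurious words. A different definition, such as the fraction of exactly matched words, would give different numbers.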
