Fast multi-language LSTM-based online handwriting recognition

We describe an online handwriting system that is able to support 102 languages using a deep neural network architecture. This new system has completely replaced our previous segment-and-decode-based system and reduced the error rate by 20–40% relative for most languages. Further, we report new state-of-the-art results on IAM-OnDB for both the open and closed dataset setting. The system combines methods from sequence recognition with a new input encoding using Bézier curves. This leads to up to 10× faster recognition times compared to our previous system. Through a series of experiments, we determine the optimal configuration of our models and report the results of our setup on a number of additional public datasets.
